Paper Title
COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation
Paper Authors
Paper Abstract
As language models become increasingly integrated into our digital lives, Personalized Text Generation (PTG) has emerged as a pivotal component with a wide range of applications. However, the bias inherent in user-written text, which is often used for PTG model training, can inadvertently associate different levels of linguistic quality with users' protected attributes. The model can inherit this bias and perpetuate inequality in generating text w.r.t. users' protected attributes, leading to unfair treatment when serving users. In this work, we investigate the fairness of PTG in the context of personalized explanation generation for recommendations. We first discuss the biases in generated explanations and their fairness implications. To promote fairness, we introduce a general framework to achieve measure-specific counterfactual fairness in explanation generation. Extensive experiments and human evaluations demonstrate the effectiveness of our method.