Paper Title
Unlearning Protected User Attributes in Recommendations with Adversarial Training
Paper Authors
Abstract
Collaborative filtering algorithms capture underlying consumption patterns, including ones specific to particular demographics or to protected information of users, e.g., gender, race, and location. These encoded biases can steer the decisions of a recommender system (RS) toward further separating the content provided to different demographic subgroups, and they raise privacy concerns regarding the disclosure of users' protected attributes. In this work, we investigate the possibility and challenges of removing specific protected information of users from the learned interaction representations of an RS algorithm while maintaining its effectiveness. Specifically, we incorporate adversarial training into the state-of-the-art MultVAE architecture, resulting in a novel model, the Adversarial Variational Auto-Encoder with Multinomial Likelihood (Adv-MultVAE), which aims to remove the implicit information of protected attributes while preserving recommendation performance. We conduct experiments on the MovieLens-1M and LFM-2b-DemoBias datasets and evaluate the effectiveness of the bias mitigation method by the inability of external attackers to reveal users' gender information from the model. Compared with the baseline MultVAE, the results show that Adv-MultVAE, with marginal deterioration in performance (w.r.t. NDCG and recall), largely mitigates inherent biases in the model on both datasets.
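The adversarial training described in the abstract is commonly implemented by attaching a classifier for the protected attribute to the model's latent representations and reversing its gradient into the encoder, so the encoder learns representations the classifier cannot exploit. The sketch below illustrates only that unlearning idea on a toy linear autoencoder with manual NumPy gradients; it is not the paper's actual variational (MultVAE) architecture, and all dimensions, names, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: binary user-item interactions plus a binary protected
# attribute per user (standing in for, e.g., gender). Illustrative only.
n_users, n_items, d = 64, 20, 4
X = (rng.random((n_users, n_items)) < 0.3).astype(float)
g = rng.integers(0, 2, n_users).astype(float)

# Linear encoder/decoder stand-in for the VAE, plus a logistic adversary
# that tries to predict the protected attribute from the latent codes.
We = 0.1 * rng.standard_normal((n_items, d))   # encoder weights
Wd = 0.1 * rng.standard_normal((d, n_items))   # decoder weights
wa = 0.1 * rng.standard_normal(d)              # adversary head

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def recon_loss():
    Z = X @ We
    return float(np.mean((Z @ Wd - X) ** 2))

lr, lam = 0.05, 0.1   # lam scales the reversed adversary gradient
loss_before = recon_loss()

for _ in range(300):
    Z = X @ We
    # Reconstruction (MSE) gradients.
    D = 2.0 * (Z @ Wd - X) / X.size
    gWd = Z.T @ D
    gZ_rec = D @ Wd.T
    # Adversary: logistic regression on the latent codes.
    p = sigmoid(Z @ wa)
    da = (p - g) / n_users            # grad of BCE w.r.t. logits
    gwa = Z.T @ da
    gZ_adv = np.outer(da, wa)
    # Gradient reversal: the encoder *ascends* the adversary's loss,
    # removing attribute information, while still descending on
    # reconstruction; the adversary itself still descends on its loss.
    We -= lr * (X.T @ (gZ_rec - lam * gZ_adv))
    Wd -= lr * gWd
    wa -= lr * gwa

loss_after = recon_loss()
```

With a small `lam`, reconstruction quality is largely preserved while the encoder is pushed away from attribute-predictive directions, mirroring the trade-off the abstract reports (marginal NDCG/recall loss for large bias reduction). In the actual model, the same reversed gradient would flow into the VAE encoder via a gradient reversal layer rather than hand-written updates.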