Paper Title

Weak Proxies are Sufficient and Preferable for Fairness with Missing Sensitive Attributes

Authors

Zhaowei Zhu, Yuanshun Yao, Jiankai Sun, Hang Li, Yang Liu

Abstract

Evaluating fairness can be challenging in practice because the sensitive attributes of data are often inaccessible due to privacy constraints. The go-to approach in industry is to use off-the-shelf proxy models to predict the missing sensitive attributes, as done at Meta [Alao et al., 2021] and Twitter [Belli et al., 2022]. Despite its popularity, three important questions remain unanswered: (1) Is directly using proxies efficacious in measuring fairness? (2) If not, is it possible to accurately evaluate fairness using proxies only? (3) Given the ethical controversy over inferring users' private information, is it possible to use only weak (i.e., inaccurate) proxies in order to protect privacy? First, our theoretical analyses show that directly using proxy models can give a false sense of (un)fairness. Second, we develop an algorithm that can measure fairness (provably) accurately with only three properly identified proxies. Third, we show that our algorithm allows the use of only weak proxies (e.g., with only 68.85% accuracy on COMPAS), adding an extra layer of protection for user privacy. Experiments validate our theoretical analyses and show that our algorithm can effectively measure and mitigate bias. Our results imply a set of practical guidelines for practitioners on how to use proxies properly. Code is available at github.com/UCSC-REAL/fair-eval.
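
To make the "false sense of (un)fairness" claim concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm and does not use the fair-eval code. It only illustrates, under stated assumptions, how plugging a noisy proxy directly into a fairness metric can distort the measurement. The assumptions are ours: two equally sized groups, demographic parity as the metric, and a proxy that mislabels the same fraction of people in every (group, prediction) cell. The names `build_toy_data` and `demographic_parity_gap` are hypothetical helpers.

```python
def demographic_parity_gap(preds, groups):
    """Absolute difference in positive-prediction rate between groups 0 and 1."""
    rates = []
    for g in (0, 1):
        member_preds = [p for p, a in zip(preds, groups) if a == g]
        rates.append(sum(member_preds) / len(member_preds))
    return abs(rates[0] - rates[1])


def build_toy_data(flip_fraction=0.3):
    """Build a small synthetic population (assumed numbers, not real data).

    Each cell is (true group, model prediction, count). The proxy flips the
    group label for `flip_fraction` of the people in every cell, so the proxy
    error is independent of the prediction given the true group.
    """
    cells = [(0, 1, 60), (0, 0, 40), (1, 1, 30), (1, 0, 70)]
    preds, true_groups, proxy_groups = [], [], []
    for group, pred, count in cells:
        n_flipped = round(count * flip_fraction)
        for i in range(count):
            preds.append(pred)
            true_groups.append(group)
            # The first n_flipped people in the cell get the wrong proxy label.
            proxy_groups.append(1 - group if i < n_flipped else group)
    return preds, true_groups, proxy_groups


if __name__ == "__main__":
    preds, true_groups, proxy_groups = build_toy_data(flip_fraction=0.3)
    true_gap = demographic_parity_gap(preds, true_groups)
    proxy_gap = demographic_parity_gap(preds, proxy_groups)
    print(f"gap with true attributes:  {true_gap:.2f}")   # 0.30
    print(f"gap with proxy attributes: {proxy_gap:.2f}")  # 0.12
```

In this toy setting a proxy with 70% accuracy shrinks the measured gap from 0.30 to 0.12, so the model looks fairer than it is. With other noise patterns the distortion could go in the opposite direction, which is why a correction step is needed rather than direct use of the proxy labels.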
