Paper Title
Are Two Heads the Same as One? Identifying Disparate Treatment in Fair Neural Networks
Paper Authors
Paper Abstract
We show that deep networks trained to satisfy demographic parity often do so through a form of race or gender awareness, and that the more we force a network to be fair, the more accurately we can recover race or gender from the internal state of the network. Based on this observation, we investigate an alternative fairness approach: we add a second classification head to the network to explicitly predict the protected attribute (such as race or gender) alongside the original task. After training the two-headed network, we enforce demographic parity by merging the two heads, creating a network with the same architecture as the original network. We establish a close relationship between existing approaches and our approach by showing (1) that the decisions of a fair classifier are well-approximated by our approach, and (2) that an unfair and optimally accurate classifier can be recovered from a fair classifier and our second head predicting the protected attribute. We use our explicit formulation to argue that the existing fairness approaches, just as ours, demonstrate disparate treatment and that they are likely to be unlawful in a wide range of scenarios under US law.
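The abstract only outlines the two-headed construction, so the sketch below illustrates what such a network and a head-merging step might look like in PyTorch. It is a minimal illustration under stated assumptions: the class and function names, the use of simple linear heads, and the merging rule (subtracting a scaled copy of the protected-attribute head's weights, governed by a hypothetical coefficient `alpha`) are illustrative choices, not the paper's exact procedure.

```python
import torch
import torch.nn as nn


class TwoHeadedNet(nn.Module):
    """Shared backbone with two linear heads: one for the original task,
    one for predicting the protected attribute (e.g. race or gender)."""

    def __init__(self, in_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.task_head = nn.Linear(hidden_dim, 1)       # original task logit
        self.protected_head = nn.Linear(hidden_dim, 1)  # protected-attribute logit

    def forward(self, x):
        z = self.backbone(x)
        return self.task_head(z), self.protected_head(z)


def merge_heads(model: TwoHeadedNet, alpha: float) -> nn.Linear:
    """Fold the protected-attribute head into the task head, producing a single
    head so the merged network has the same architecture as the original.
    `alpha` is an assumed hyperparameter controlling how strongly the protected
    signal is cancelled; in practice it would be tuned so the merged classifier
    satisfies demographic parity on held-out data."""
    merged = nn.Linear(model.task_head.in_features, 1)
    with torch.no_grad():
        merged.weight.copy_(model.task_head.weight - alpha * model.protected_head.weight)
        merged.bias.copy_(model.task_head.bias - alpha * model.protected_head.bias)
    return merged
```

In this sketch, the merged head makes the dependence on the protected attribute explicit: the fair decision is the original task logit minus a scaled estimate of group membership, which is the sense in which the abstract argues that such classifiers exhibit disparate treatment.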