Paper Title
A Reduction to Binary Approach for Debiasing Multiclass Datasets
Paper Authors
Abstract
We propose a novel reduction-to-binary (R2B) approach that enforces demographic parity for multiclass classification with non-binary sensitive attributes via a reduction to a sequence of binary debiasing tasks. We prove that R2B satisfies optimality and bias guarantees and demonstrate empirically that it can lead to an improvement over two baselines: (1) treating multiclass problems as multi-label by debiasing labels independently and (2) transforming the features instead of the labels. Surprisingly, we also demonstrate that independent label debiasing yields competitive results in most (but not all) settings. We validate these conclusions on synthetic and real-world datasets from social science, computer vision, and healthcare.
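To make the fairness criterion in the abstract concrete, the sketch below measures demographic parity for multiclass labels with a non-binary sensitive attribute. It is an illustration only, not the R2B algorithm: the function name `demographic_parity_gap`, its parameters, and the per-class one-vs-rest formulation are assumptions chosen for this example. For each class, it compares the rate at which that class is assigned across all sensitive groups and reports the largest difference.

```python
import numpy as np


def demographic_parity_gap(labels, groups, num_classes):
    """Per-class demographic parity gap (hypothetical helper, not from the paper).

    For each class k, computes P(label == k | group == g) for every group g
    and returns max_g - min_g. A gap of 0 for every class means each class is
    assigned at the same rate in all groups.
    """
    labels = np.asarray(labels)
    groups = np.asarray(groups)
    unique_groups = np.unique(groups)

    gaps = np.zeros(num_classes)
    for k in range(num_classes):
        # One-vs-rest indicator: 1.0 where the label equals class k.
        is_k = (labels == k).astype(float)
        # Rate of class k inside each sensitive group.
        rates = [is_k[groups == g].mean() for g in unique_groups]
        gaps[k] = max(rates) - min(rates)
    return gaps


# Toy data: 3 classes, 2 sensitive groups (values are made up for illustration).
balanced = demographic_parity_gap([0, 1, 2, 0, 1, 2], [0, 0, 0, 1, 1, 1], 3)
skewed = demographic_parity_gap([0, 0, 1, 2], [0, 0, 1, 1], 3)
```

In the first toy call every class appears equally often in both groups, so all gaps are zero. In the second, class 0 appears only in group 0, so its gap is as large as possible. Viewing each class as its own binary indicator mirrors the multi-label baseline the abstract describes, where labels are debiased independently.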