Paper Title
Assessing Demographic Bias Transfer from Dataset to Model: A Case Study in Facial Expression Recognition
Paper Authors
Paper Abstract
The increasing number of applications of Artificial Intelligence (AI) has led researchers to study the social impact of these technologies and evaluate their fairness. Unfortunately, current fairness metrics are hard to apply in multi-class multi-demographic classification problems, such as Facial Expression Recognition (FER). We propose a new set of metrics to approach these problems. Of the three metrics proposed, two focus on the representational and stereotypical bias of the dataset, and the third one on the residual bias of the trained model. These metrics combined can potentially be used to study and compare diverse bias mitigation methods. We demonstrate the usefulness of the metrics by applying them to a FER problem based on the popular AffectNet dataset. Like many other datasets for FER, AffectNet is a large Internet-sourced dataset with 291,651 labeled images. Obtaining images from the Internet raises some concerns over the fairness of any system trained on this data and its ability to generalize properly to diverse populations. We first analyze the dataset and some variants, finding substantial racial bias and gender stereotypes. We then extract several subsets with different demographic properties and train a model on each one, observing the amount of residual bias in the different setups. We also provide a second analysis on a different dataset, FER+.
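The abstract does not spell out how the proposed metrics are computed, so the following is only a minimal sketch of the general idea of a dataset-level representational bias score, assuming it can be expressed as one minus the normalized entropy of the demographic group proportions (0 for a perfectly balanced dataset, close to 1 when a single group dominates). This is an illustration, not the paper's actual formulation, and the group names and counts below are hypothetical rather than taken from AffectNet.

```python
import numpy as np

def representational_bias(group_counts):
    """Illustrative dataset-level representational bias score.

    Maps demographic group names to image counts and returns
    1 - normalized Shannon entropy of the group proportions:
    0 means perfectly balanced groups, values near 1 mean a
    single group dominates the dataset.
    """
    counts = np.array(list(group_counts.values()), dtype=float)
    proportions = counts / counts.sum()
    # Normalized entropy equals 1 for a uniform distribution over the groups.
    entropy = -(proportions * np.log(proportions + 1e-12)).sum()
    normalized_entropy = entropy / np.log(len(counts))
    return 1.0 - normalized_entropy

# Hypothetical group counts for a FER dataset; not actual AffectNet figures.
example = {"group_a": 150_000, "group_b": 90_000, "group_c": 40_000, "group_d": 11_651}
print(f"representational bias: {representational_bias(example):.3f}")
```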