Title
Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat
Authors
Abstract
As we navigate our cultural environment, we learn cultural biases, like those around gender, social class, health, and body weight. It is unclear, however, exactly how public culture becomes private culture. In this paper, we provide a theoretical account of such cultural learning. We propose that neural word embeddings provide a parsimonious and cognitively plausible model of the representations learned from natural language. Using neural word embeddings, we extract cultural schemata about body weight from New York Times articles. We identify several cultural schemata that link obesity to gender, immorality, poor health, and low socioeconomic class. Such schemata may be subtly but pervasively activated in public culture; thus, language can chronically reproduce biases. Our findings reinforce ongoing concerns that machine learning can also encode, and reproduce, harmful human biases.
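The schema-extraction idea described above can be illustrated with a minimal sketch: a cultural "dimension" (e.g. moral vs. immoral) is built as the difference between antonym vectors, and a word's position on that schema is its projection onto the dimension. The vectors and word list below are toy stand-ins invented for illustration, not the paper's actual embeddings, which would be trained on the New York Times corpus.

```python
import numpy as np

# Toy 4-dimensional "embeddings" standing in for vectors a model such as
# word2vec would learn from a corpus (real vectors have 100-300 dims).
# Values are fabricated for illustration only.
vecs = {
    "fat":     np.array([ 0.9, -0.3,  0.1,  0.2]),
    "thin":    np.array([-0.8,  0.4,  0.0,  0.1]),
    "immoral": np.array([ 0.7, -0.2,  0.3,  0.0]),
    "moral":   np.array([-0.6,  0.5,  0.2,  0.1]),
}

def unit(v):
    """Normalize a vector to unit length."""
    return v / np.linalg.norm(v)

# A cultural dimension is the difference between antonym vectors;
# here, a morality axis pointing from "immoral" toward "moral".
morality_axis = unit(vecs["moral"] - vecs["immoral"])

def project(word):
    """Cosine projection of a word onto the morality axis.

    Positive values mean the word sits closer to the "moral" pole,
    negative values closer to the "immoral" pole.
    """
    return float(np.dot(unit(vecs[word]), morality_axis))

print(project("fat"))   # negative: "fat" leans toward the "immoral" pole
print(project("thin"))  # positive: "thin" leans toward the "moral" pole
```

With real embeddings, the same projection applied to many body-weight terms and many antonym-pair dimensions (gender, health, class) would yield the kind of schema measurements the abstract describes.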