Paper title
Whose AI Dream? In search of the aspiration in data annotation
Paper authors
Paper abstract
This paper presents the practice of data annotation from the perspective of the annotators. Data is fundamental to ML models. This paper investigates the work practices concerning data annotation as performed in industry in India. Previous investigations have largely focused on annotator subjectivity, bias, and efficiency. We present a wider perspective on data annotation: following a grounded approach, we conducted three sets of interviews with 25 annotators, 10 industry experts, and 12 ML practitioners. Our results show that the work of annotators is dictated by the interests, priorities, and values of others above their station. More than technical, we contend that data annotation is a systematic exercise of power through organizational structure and practice. We propose a set of implications for how we can cultivate and encourage better practice to balance the tension between the need for high-quality data at low cost and the annotator aspiration for well-being, career perspective, and active participation in building the AI dream.