Paper title
Investigating Differences in Crowdsourced News Credibility Assessment: Raters, Tasks, and Expert Criteria
Paper authors
Paper abstract
Misinformation about critical issues such as climate change and vaccine safety is often amplified on online social and search platforms. Crowdsourcing content credibility assessment to laypeople has been proposed as one strategy to combat misinformation by attempting to replicate the assessments of experts at scale. In this work, we investigate news credibility assessments by crowds versus experts to understand when and how their ratings differ. We gather a dataset of over 4,000 credibility assessments taken from two crowd groups (journalism students and Upwork workers) and two expert groups (journalists and scientists) on a varied set of 50 news articles related to climate science, a topic with a widespread disconnect between public opinion and expert consensus. Examining the ratings, we find differences in performance due to the makeup of the crowd, such as rater demographics and political leaning, as well as the scope of the tasks that the crowd is assigned, such as the genre of the article and the partisanship of the publication. Finally, we find differences between expert assessments due to the differing criteria that journalism and science experts use. These differences may contribute to crowd discrepancies, but they also suggest a way to reduce the gap: designing crowd tasks tailored to specific expert criteria. From these findings, we outline future research directions to better design crowd processes that are tailored to specific crowds and types of content.