Paper Title
Label Noise-Resistant Mean Teaching for Weakly Supervised Fake News Detection
Paper Authors
Paper Abstract
Fake news spreads at an unprecedented speed, reaches global audiences, and poses serious risks to users and communities. Most existing fake news detection algorithms train supervised models on large amounts of manually labeled data, which are expensive to acquire and often unavailable. In this work, we propose a novel label noise-resistant mean teaching approach (LNMT) for weakly supervised fake news detection. LNMT leverages unlabeled news and users' feedback comments to enlarge the training data, and facilitates model training by generating refined labels as weak supervision. Specifically, LNMT automatically assigns initial weak labels to unlabeled samples based on the semantic correlation and emotional association between the news content and its comments. Moreover, to suppress the noise in the weak labels, LNMT establishes a mean teacher framework equipped with label propagation and label reliability estimation. The framework measures a weak label similarity matrix between the teacher and student networks, and propagates valuable weak label information to refine the weak labels. Meanwhile, it exploits the consistency between the output class likelihood vectors of the two networks to evaluate the reliability of the weak labels, and incorporates this reliability into model optimization to alleviate the negative effect of noisy weak labels. Extensive experiments show the superior performance of LNMT.
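The abstract gives no formulas, so the following is only an illustrative sketch of two ideas it names: a mean teacher whose weights track the student, and a per-sample reliability score derived from the agreement between the two networks' class likelihood vectors. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the exponential moving average with `decay`, the use of one minus the total variation distance as the reliability score, and the reliability-weighted cross-entropy. The label propagation step is not sketched.

```python
import numpy as np


def ema_update(teacher, student, decay=0.99):
    """Standard mean-teacher update (assumed here): each teacher weight is an
    exponential moving average of the corresponding student weight."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}


def softmax(logits):
    """Row-wise softmax turning logits into class likelihood vectors."""
    z = logits - logits.max(axis=1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def reliability(p_student, p_teacher):
    """Hypothetical reliability score in [0, 1] per sample:
    1 minus the total variation distance between the two likelihood vectors.
    Identical predictions give 1, fully disjoint predictions give 0."""
    return 1.0 - 0.5 * np.abs(p_student - p_teacher).sum(axis=1)


def weighted_cross_entropy(p_student, weak_labels, weights, eps=1e-12):
    """Cross-entropy against the weak labels, with each sample scaled by its
    reliability weight so that low-reliability labels contribute less."""
    picked = p_student[np.arange(len(weak_labels)), weak_labels]
    losses = -np.log(picked + eps)
    return float(np.sum(weights * losses) / (np.sum(weights) + eps))
```

In a training loop under these assumptions, one would compute `softmax` outputs for both networks on a batch, pass them to `reliability`, use the result as `weights` in `weighted_cross_entropy` for the student, and call `ema_update` after each student step.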