Paper Title

Cross-Domain Learning for Classifying Propaganda in Online Contents

Authors

Liqiang Wang, Xiaoyu Shen, Gerard de Melo, Gerhard Weikum

Abstract

As news and social media exhibit an increasing amount of manipulative polarized content, detecting such propaganda has received attention as a new task for content analysis. Prior work has focused on supervised learning with training data from the same domain. However, as propaganda can be subtle and keeps evolving, manual identification and proper labeling are very demanding. As a consequence, training data is a major bottleneck. In this paper, we tackle this bottleneck and present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic. We devise informative features and build various classifiers for propaganda labeling, using cross-domain learning. Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step. We further analyze the influence of various features, and characterize salient indicators of propaganda.
