Paper title

Healthy Twitter discussions? Time will tell

Paper authors

Gnatyshak, Dmitry; Garcia-Gasulla, Dario; Alvarez-Napagao, Sergio; Arjona, Jamie; Venturini, Tommaso

Paper abstract

Studying misinformation and how to deal with unhealthy behaviours within online discussions has recently become an important field of research within social studies. With the rapid development of social media, and the increasing amount of available information and sources, rigorous manual analysis of such discourses has become infeasible. Many approaches tackle the issue by studying the semantic and syntactic properties of discussions following a supervised approach, for example using natural language processing on a dataset labeled for abusive, fake or bot-generated content. Solutions that depend on a ground truth are limited to domains where one can be established. However, within the context of misinformation, it may be difficult or even impossible to assign labels to instances. In this context, we consider the use of temporal dynamic patterns as an indicator of discussion health. Working in a domain for which ground truth was unavailable at the time (early COVID-19 pandemic discussions), we explore the characterization of discussions based on the volume and time of contributions. First, we explore the types of discussions in an unsupervised manner, and then characterize these types using the concept of ephemerality, which we formalize. Finally, we discuss the potential use of our ephemerality definition for labeling online discourses based on how desirable, healthy and constructive they are.
