Paper Title
We Don't Speak the Same Language: Interpreting Polarization through Machine Translation
Paper Authors
Paper Abstract
Polarization among US political parties, media and elites is a widely studied topic. Prominent lines of prior research across multiple disciplines have observed and analyzed growing polarization in social media. In this paper, we present a new methodology that offers a fresh perspective on interpreting polarization through the lens of machine translation. With a novel proposition that two sub-communities are speaking in two different \emph{languages}, we demonstrate that modern machine translation methods can provide a simple yet powerful and interpretable framework to understand the differences between two (or more) large-scale social media discussion data sets at the granularity of words. Via a substantial corpus of 86.6 million comments by 6.5 million users on over 200,000 news videos hosted by YouTube channels of four prominent US news networks, we demonstrate that simple word-level and phrase-level translation pairs can reveal deep insights into the current political divide -- what is \emph{black lives matter} to one can be \emph{all lives matter} to the other.