Paper Title

SubjQA: A Dataset for Subjectivity and Review Comprehension

Paper Authors

Johannes Bjerva, Nikita Bhutani, Behzad Golshan, Wang-Chiew Tan, Isabelle Augenstein

Abstract

Subjectivity is the expression of internal opinions or beliefs which cannot be objectively observed or verified, and has been shown to be important for sentiment analysis and word-sense disambiguation. Furthermore, subjectivity is an important aspect of user-generated data. In spite of this, subjectivity has not been investigated in contexts where such data is widespread, such as in question answering (QA). We therefore investigate the relationship between subjectivity and QA, while developing a new dataset. We compare and contrast with analyses from previous work, and verify that findings regarding subjectivity still hold when using recently developed NLP architectures. We find that subjectivity is also an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance. For instance, a subjective question may or may not be associated with a subjective answer. We release an English QA dataset (SubjQA) based on customer reviews, containing subjectivity annotations for questions and answer spans across 6 distinct domains.
