Paper Title
Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems
Paper Authors
Paper Abstract
Centering theory (CT; Grosz et al., 1995) provides a linguistic analysis of the structure of discourse. According to the theory, the local coherence of a discourse arises from the manner and extent to which successive utterances refer to the same entities. In this paper, we investigate the connection between centering theory and modern coreference resolution systems. We provide an operationalization of centering and, by defining various discourse metrics and developing a search-based methodology, systematically investigate whether neural coreference resolvers adhere to the rules of centering theory. Our information-theoretic analysis reveals a positive dependence between coreference and centering, but it also shows that high-quality neural coreference resolvers may not benefit much from explicitly modeling centering ideas. Our analysis further shows that contextualized embeddings contain much of the coherence information, which helps explain why CT provides only small gains to modern neural coreference resolvers that use pretrained representations. Finally, we discuss factors that contribute to coreference but are not modeled by CT, such as world knowledge and recency bias. We formulate a version of CT that also models recency and show that it captures coreference information better than vanilla CT.
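To make the notion of "adhering to the rules of centering theory" concrete, the following is a minimal sketch of the textbook CT transition labels (Continue, Retain, Smooth-Shift, Rough-Shift). It is not the operationalization used in the paper, which the abstract does not specify. The function names, the `NO_CB` label, and the example discourse are illustrative assumptions. Each utterance is assumed to arrive as a ranked list of forward-looking centers (entity identifiers ordered by salience), which presupposes that coreference has already been resolved.

```python
from typing import List, Optional


def backward_center(prev_cf: List[str], cur_cf: List[str]) -> Optional[str]:
    """Cb(U_n): the highest-ranked entity of Cf(U_{n-1}) that is realized in U_n."""
    realized = set(cur_cf)
    for entity in prev_cf:  # prev_cf is ordered from most to least salient
        if entity in realized:
            return entity
    return None  # the two utterances share no entity


def transition(prev_cb: Optional[str], cur_cb: Optional[str], cur_cf: List[str]) -> str:
    """Label the transition into the current utterance."""
    if cur_cb is None:
        return "NO_CB"  # hypothetical label for utterances without a backward center
    cp = cur_cf[0]  # preferred center: the top-ranked forward-looking center
    # An undefined previous Cb is conventionally treated as "unchanged".
    same_cb = prev_cb is None or cur_cb == prev_cb
    if same_cb:
        return "CONTINUE" if cur_cb == cp else "RETAIN"
    return "SMOOTH_SHIFT" if cur_cb == cp else "ROUGH_SHIFT"


def label_discourse(cf_lists: List[List[str]]) -> List[str]:
    """Return one transition label per adjacent pair of utterances."""
    labels = []
    prev_cb = None
    for prev_cf, cur_cf in zip(cf_lists, cf_lists[1:]):
        cur_cb = backward_center(prev_cf, cur_cf)
        labels.append(transition(prev_cb, cur_cb, cur_cf))
        prev_cb = cur_cb
    return labels


# Illustrative discourse (made up for this sketch), with pronouns already resolved:
#   U1: John went to the store.   U2: He bought milk.
#   U3: Mary saw him.             U4: She went home.
example = [
    ["John", "store"],
    ["John", "milk"],
    ["Mary", "John"],
    ["Mary", "home"],
]
print(label_discourse(example))  # ['CONTINUE', 'RETAIN', 'SMOOTH_SHIFT']
```

Because the labels depend on which mentions are mapped to the same entity, different coreference outputs for the same text can yield different transition sequences. This is one simple way to see how coreference and centering can be statistically dependent.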