Paper title
Automatically Generating Counterfactuals for Relation Classification
Authors
Abstract
Relation classification (RC) aims to extract the semantic relations between or among entities in text. Because RC is a fundamental task in natural language processing, ensuring the robustness of RC models is crucial. Although current deep neural models achieve high accuracy on RC tasks, they are easily affected by spurious correlations. One solution is to train the model on counterfactually augmented data (CAD) so that it learns causation rather than confounding. However, no attempt has been made to generate counterfactuals for RC tasks. In this paper, we formulate the problem of automatically generating CAD for RC tasks from an entity-centric viewpoint and develop a novel approach to derive contextual counterfactuals for entities. Specifically, we exploit two elementary topological properties of syntactic and semantic dependency graphs, namely centrality and the shortest path, to first identify and then intervene on the contextual causal features of entities. We conduct a comprehensive evaluation on four RC datasets by combining our approach with a variety of backbone RC models. The results show that our approach not only improves the performance of the backbones but also makes them more robust in out-of-domain tests.
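
The two topological properties named in the abstract, the shortest path and centrality in a dependency graph, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sentence ("Steve Jobs founded Apple in California"), the hand-written dependency edges, the use of degree centrality, and the helper names `build_graph`, `shortest_path`, `degree_centrality`, and `candidate_causal_tokens` are all assumptions made for illustration. The sketch only shows how tokens lying on the shortest path between two entities could be ranked by centrality as candidate contextual features; how the paper actually intervenes on them is not specified in the abstract.

```python
from collections import deque

# Hypothetical, hand-written dependency edges for the toy sentence
# "Steve Jobs founded Apple in California" (treated as undirected).
EDGES = [
    ("founded", "Jobs"),
    ("Jobs", "Steve"),
    ("founded", "Apple"),
    ("founded", "California"),
    ("California", "in"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (head, dependent) pairs."""
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, set()).add(dep)
        graph.setdefault(dep, set()).add(head)
    return graph


def shortest_path(graph, source, target):
    """Breadth-first search; returns the token path or [] if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return []


def degree_centrality(graph):
    """Degree of each node divided by (number of nodes - 1)."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def candidate_causal_tokens(graph, head_entity, tail_entity):
    """Rank tokens strictly between the two entities by centrality (assumed heuristic)."""
    path = shortest_path(graph, head_entity, tail_entity)
    centrality = degree_centrality(graph)
    interior = path[1:-1]  # drop the entity tokens themselves
    return sorted(interior, key=lambda tok: centrality[tok], reverse=True)


graph = build_graph(EDGES)
path = shortest_path(graph, "Jobs", "Apple")
candidates = candidate_causal_tokens(graph, "Jobs", "Apple")
print(path)
print(candidates)
```

In this toy graph the only token between the two entities is the predicate "founded", which is also the most central node. A counterfactual example could then be produced by altering that token while keeping the entities fixed, though the specific intervention used here would be a design choice, not something stated in the abstract.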