Paper Title

Explaining The Efficacy of Counterfactually Augmented Data

Authors

Divyansh Kaushik, Amrith Setlur, Eduard Hovy, Zachary C. Lipton

Abstract

In attempts to produce ML models less reliant on spurious patterns in NLP datasets, researchers have recently proposed curating counterfactually augmented data (CAD) via a human-in-the-loop process in which given some documents and their (initial) labels, humans must revise the text to make a counterfactual label applicable. Importantly, edits that are not necessary to flip the applicable label are prohibited. Models trained on the augmented data appear, empirically, to rely less on semantically irrelevant words and to generalize better out of domain. While this work draws loosely on causal thinking, the underlying causal model (even at an abstract level) and the principles underlying the observed out-of-domain improvements remain unclear. In this paper, we introduce a toy analog based on linear Gaussian models, observing interesting relationships between causal models, measurement noise, out-of-domain generalization, and reliance on spurious signals. Our analysis provides some insights that help to explain the efficacy of CAD. Moreover, we develop the hypothesis that while adding noise to causal features should degrade both in-domain and out-of-domain performance, adding noise to non-causal features should lead to relative improvements in out-of-domain performance. This idea inspires a speculative test for determining whether a feature attribution technique has identified the causal spans. If adding noise (e.g., by random word flips) to the highlighted spans degrades both in-domain and out-of-domain performance on a battery of challenge datasets, but adding noise to the complement gives improvements out-of-domain, it suggests we have identified causal spans. We present a large-scale empirical study comparing spans edited to create CAD to those selected by attention and saliency maps. Across numerous domains and models, we find that the hypothesized phenomenon is pronounced for CAD.
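The toy analog mentioned in the abstract can be illustrated with a small linear Gaussian simulation. The sketch below is not the paper's actual setup; the feature construction, noise magnitudes, domain shift, and ridge solver are assumptions chosen only to exhibit the hypothesized asymmetry: adding noise to the causal feature degrades out-of-domain accuracy, while adding noise to the non-causal (spurious) feature yields a relative out-of-domain improvement.

```python
# Minimal sketch (assumed setup, not the paper's): a linear Gaussian toy model
# showing how noising causal vs. spurious features affects in-domain and
# out-of-domain (OOD) accuracy.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, spurious_corr, noise_causal=0.0, noise_spurious=0.0):
    # The label y causes the "causal" feature; the "spurious" feature tracks y
    # only as strongly as spurious_corr allows, and that correlation flips
    # sign out of domain.
    y = rng.choice([-1.0, 1.0], size=n)
    x_causal = y + rng.normal(0.0, 1.0, n) + rng.normal(0.0, noise_causal, n)
    x_spurious = spurious_corr * y + rng.normal(0.0, 1.0, n) + rng.normal(0.0, noise_spurious, n)
    return np.stack([x_causal, x_spurious], axis=1), y

def fit_ridge(X, y, lam=1e-3):
    # Least squares with a small ridge term; sign(X @ w) is the predicted label.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def accuracy(w, X, y):
    return float(np.mean(np.sign(X @ w) == y))

settings = [("no added noise",    0.0, 0.0),
            ("noise on causal",   2.0, 0.0),
            ("noise on spurious", 0.0, 2.0)]

for name, nc, ns in settings:
    X_tr, y_tr = make_data(20_000, spurious_corr=1.0, noise_causal=nc, noise_spurious=ns)
    w = fit_ridge(X_tr, y_tr)
    X_id, y_id = make_data(20_000, spurious_corr=1.0)      # in-domain: correlation holds
    X_ood, y_ood = make_data(20_000, spurious_corr=-1.0)   # out-of-domain: correlation flips
    print(f"{name:18s} in-domain={accuracy(w, X_id, y_id):.3f}  OOD={accuracy(w, X_ood, y_ood):.3f}")
```

Under these assumptions, the spurious feature's correlation with the label reverses in the out-of-domain split, so any weight the learner places on it becomes a liability; noising that feature during training shifts weight onto the causal feature, trading a small in-domain loss for a large out-of-domain gain, whereas noising the causal feature does the opposite.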
