Paper Title
An Empirical Study on Explanations in Out-of-Domain Settings
Paper Authors
Paper Abstract
Recent work in Natural Language Processing has focused on developing approaches that extract faithful explanations, either by identifying the most important tokens in the input (i.e. post-hoc explanations) or by designing inherently faithful models that first select the most important tokens and then use them to predict the correct label (i.e. select-then-predict models). Currently, these approaches are largely evaluated in in-domain settings. Yet, little is known about how post-hoc explanations and inherently faithful models perform in out-of-domain settings. In this paper, we conduct an extensive empirical study that examines: (1) the out-of-domain faithfulness of post-hoc explanations generated by five feature attribution methods; and (2) the out-of-domain performance of two inherently faithful models over six datasets. Contrary to our expectations, the results show that in many cases out-of-domain post-hoc explanation faithfulness, measured by sufficiency and comprehensiveness, is higher than in-domain faithfulness. We find this result misleading and suggest using a random baseline as a yardstick for evaluating post-hoc explanation faithfulness. Our findings also show that select-then-predict models achieve predictive performance in out-of-domain settings comparable to that of full-text trained models.
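For concreteness, below is a minimal sketch (not from the paper) of how the sufficiency and comprehensiveness metrics mentioned in the abstract are typically computed, along with the random-rationale yardstick the abstract proposes. The `predict_proba` callable is a hypothetical stand-in for any classifier that maps a token sequence to the probability it assigns to its originally predicted class; function names and signatures here are illustrative assumptions, not the authors' implementation.

```python
# Sketch of ERASER-style faithfulness metrics (following DeYoung et al., 2020).
# Assumptions: `predict_proba` is a hypothetical classifier interface returning
# p(y_hat | tokens) for the class y_hat predicted on the full input.
import random
from typing import Callable, List

def sufficiency(tokens: List[str],
                rationale_idx: List[int],
                predict_proba: Callable[[List[str]], float]) -> float:
    """p(y|x) - p(y|rationale only): lower means the rationale alone
    is sufficient to recover the original prediction."""
    rationale = [tokens[i] for i in rationale_idx]
    return predict_proba(tokens) - predict_proba(rationale)

def comprehensiveness(tokens: List[str],
                      rationale_idx: List[int],
                      predict_proba: Callable[[List[str]], float]) -> float:
    """p(y|x) - p(y|x with rationale removed): higher means the rationale
    captures most of the evidence the model relied on."""
    drop = set(rationale_idx)
    remainder = [t for i, t in enumerate(tokens) if i not in drop]
    return predict_proba(tokens) - predict_proba(remainder)

def random_baseline_idx(n_tokens: int, rationale_len: int) -> List[int]:
    """Random rationale of equal length: the yardstick the abstract suggests.
    An attribution method is only informative if its rationale beats this."""
    return random.sample(range(n_tokens), rationale_len)
```

Under this reading, a feature attribution method's out-of-domain scores would be judged not in isolation but against the sufficiency and comprehensiveness obtained with `random_baseline_idx` rationales of the same length.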