Paper Title

Revisiting text decomposition methods for NLI-based factuality scoring of summaries

Authors

John Glover, Federico Fancellu, Vasudevan Jagannathan, Matthew R. Gormley, Thomas Schaaf

Abstract

Scoring the factuality of a generated summary involves measuring the degree to which a target text contains factual information using the input document as support. Given the similarities in the problem formulation, previous work has shown that Natural Language Inference models can be effectively repurposed to perform this task. As these models are trained to score entailment at a sentence level, several recent studies have shown that decomposing either the input document or the summary into sentences helps with factuality scoring. But is fine-grained decomposition always a winning strategy? In this paper we systematically compare different granularities of decomposition -- from document to sub-sentence level, and we show that the answer is no. Our results show that incorporating additional context can yield improvement, but that this does not necessarily apply to all datasets. We also show that small changes to previously proposed entailment-based scoring methods can result in better performance, highlighting the need for caution in model and methodology selection for downstream tasks.
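To make the setup concrete, below is a minimal sketch of sentence-level NLI-based factuality scoring in the spirit the abstract describes: split the document and summary into sentences, score entailment for every (document sentence, summary sentence) pair, and aggregate. The model name, the naive sentence splitter, and the max-then-mean aggregation are illustrative assumptions, not the paper's exact method or chosen granularity.

```python
# Hedged sketch of sentence-level NLI factuality scoring.
# Assumptions: "roberta-large-mnli" as the NLI model, a naive period-based
# sentence splitter, max over document sentences and mean over summary
# sentences as the aggregation. The paper compares other granularities
# (document-level, sub-sentence) and scoring variants.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"  # any sentence-level NLI model could be substituted
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

ENTAILMENT_IDX = model.config.label2id.get("ENTAILMENT", 2)


def sentence_split(text):
    # Naive splitter for illustration only; a real pipeline would use a
    # proper sentence (or sub-sentence) segmenter.
    return [s.strip() for s in text.split(".") if s.strip()]


def factuality_score(document, summary):
    premises = sentence_split(document)
    hypotheses = sentence_split(summary)
    per_hypothesis = []
    for hyp in hypotheses:
        probs = []
        for prem in premises:
            inputs = tokenizer(prem, hyp, return_tensors="pt", truncation=True)
            with torch.no_grad():
                logits = model(**inputs).logits
            probs.append(torch.softmax(logits, dim=-1)[0, ENTAILMENT_IDX].item())
        # Max over document sentences: score against the best-supporting premise.
        per_hypothesis.append(max(probs))
    # Mean over summary sentences: overall factuality of the summary.
    return sum(per_hypothesis) / len(per_hypothesis)


if __name__ == "__main__":
    print(factuality_score(
        "The company reported a loss in 2020. It hired a new CEO in 2021.",
        "The company hired a new CEO in 2021.",
    ))
```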
