Paper Title
Facts2Story: Controlling Text Generation by Key Facts
Paper Authors
Paper Abstract
Recent advancements in self-attention neural network architectures have raised the bar for open-ended text generation. Yet, while current methods are capable of producing a coherent text which is several hundred words long, attaining control over the content that is being generated -- as well as evaluating it -- are still open questions. We propose a controlled generation task which is based on expanding a sequence of facts, expressed in natural language, into a longer narrative. We introduce human-based evaluation metrics for this task, as well as a method for deriving a large training dataset. We evaluate three methods on this task, based on fine-tuning pre-trained models. We show that while auto-regressive, unidirectional Language Models such as GPT2 produce better fluency, they struggle to adhere to the requested facts. We propose a plan-and-cloze model (using fine-tuned XLNet) which produces competitive fluency while adhering to the requested content.
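The plan-and-cloze idea from the abstract can be illustrated with a minimal toy sketch (this is NOT the authors' implementation; all function names and the trivial fill function are hypothetical stand-ins). The "plan" stage anchors each given fact at a position in the output sequence, and the "cloze" stage fills the masked gaps between anchored facts, where the real system would use a fine-tuned XLNet for span infilling:

```python
# Toy sketch of a plan-and-cloze pipeline (illustrative only; the real model
# uses a fine-tuned XLNet to infill the masked spans).

from typing import Callable, List

def plan(facts: List[str], n_slots: int) -> List[str]:
    """Plan stage: distribute fact sentences over n_slots narrative slots,
    leaving <mask> slots between them for the cloze model to fill."""
    slots = ["<mask>"] * n_slots
    step = max(1, n_slots // max(1, len(facts)))
    for i, fact in enumerate(facts):
        slots[min(i * step, n_slots - 1)] = fact
    return slots

def cloze_fill(slots: List[str], fill_fn: Callable[[str, str], str]) -> str:
    """Cloze stage: replace each <mask> slot with text conditioned on its
    neighbours (a stand-in for bidirectional span infilling)."""
    out = []
    for i, slot in enumerate(slots):
        if slot == "<mask>":
            left = slots[i - 1] if i > 0 else ""
            right = slots[i + 1] if i + 1 < len(slots) else ""
            out.append(fill_fn(left, right))
        else:
            out.append(slot)
    return " ".join(out)

facts = ["Anna finds a map.", "The map leads to an island.", "She sails at dawn."]
# A real fill function would generate fluent connecting text; here it is a
# constant placeholder so the anchoring behaviour is visible.
story = cloze_fill(plan(facts, 6), lambda left, right: "[connecting text]")
```

Because the facts are placed as fixed anchors before generation, every requested fact is guaranteed to appear in order in the output, which is the property the abstract contrasts with left-to-right models such as GPT2.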