Paper Title
Facts2Story: Controlling Text Generation by Key Facts
Paper Authors
Paper Abstract
Recent advancements in self-attention neural network architectures have raised the bar for open-ended text generation. Yet, while current methods are capable of producing a coherent text which is several hundred words long, attaining control over the content that is being generated -- as well as evaluating it -- are still open questions. We propose a controlled generation task which is based on expanding a sequence of facts, expressed in natural language, into a longer narrative. We introduce human-based evaluation metrics for this task, as well as a method for deriving a large training dataset. We evaluate three methods on this task, based on fine-tuning pre-trained models. We show that while auto-regressive, unidirectional Language Models such as GPT2 produce better fluency, they struggle to adhere to the requested facts. We propose a plan-and-cloze model (using fine-tuned XLNet) which produces competitive fluency while adhering to the requested content.
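The plan-and-cloze idea from the abstract can be illustrated with a minimal toy sketch (this is NOT the authors' implementation; all function names and the trivial fill function are hypothetical stand-ins). The "plan" stage anchors each given fact at a position in the output sequence, and the "cloze" stage fills the masked gaps between anchored facts, where the real system would use a fine-tuned XLNet for span infilling:

```python
# Toy sketch of a plan-and-cloze pipeline (illustrative only; the real model
# uses a fine-tuned XLNet to infill the masked spans).

from typing import Callable, List

def plan(facts: List[str], n_slots: int) -> List[str]:
    """Plan stage: distribute fact sentences over n_slots narrative slots,
    leaving <mask> slots between them for the cloze model to fill."""
    slots = ["<mask>"] * n_slots
    step = max(1, n_slots // max(1, len(facts)))
    for i, fact in enumerate(facts):
        slots[min(i * step, n_slots - 1)] = fact
    return slots

def cloze_fill(slots: List[str], fill_fn: Callable[[str, str], str]) -> str:
    """Cloze stage: replace each <mask> slot with text conditioned on its
    neighbours (a stand-in for bidirectional span infilling)."""
    out = []
    for i, slot in enumerate(slots):
        if slot == "<mask>":
            left = slots[i - 1] if i > 0 else ""
            right = slots[i + 1] if i + 1 < len(slots) else ""
            out.append(fill_fn(left, right))
        else:
            out.append(slot)
    return " ".join(out)

facts = ["Anna finds a map.", "The map leads to an island.", "She sails at dawn."]
# A real fill function would generate fluent connecting text; here it is a
# constant placeholder so the anchoring behaviour is visible.
story = cloze_fill(plan(facts, 6), lambda left, right: "[connecting text]")
```

Because the facts are placed as fixed anchors before generation, every requested fact is guaranteed to appear in order in the output, which is the property the abstract contrasts with left-to-right models such as GPT2.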