Paper Title

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

Paper Authors

Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Paper Abstract

Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks. To address this issue, we propose an enhanced multi-flow sequence to sequence pre-training and fine-tuning framework named ERNIE-GEN, which bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method. To make generation closer to human writing patterns, this framework introduces a span-by-span generation flow that trains the model to predict semantically-complete spans consecutively rather than predicting word by word. Unlike existing pre-training methods, ERNIE-GEN incorporates multi-granularity target sampling to construct pre-training data, which enhances the correlation between encoder and decoder. Experimental results demonstrate that ERNIE-GEN achieves state-of-the-art results with a much smaller amount of pre-training data and parameters on a range of language generation tasks, including abstractive summarization (Gigaword and CNN/DailyMail), question generation (SQuAD), dialogue generation (Persona-Chat) and generative question answering (CoQA).
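To make the noise-aware generation idea in the abstract more concrete, here is a minimal sketch of one common way to realize it: during training, a small fraction of the ground-truth target tokens is randomly replaced before being fed back to the decoder, so the model learns to keep generating from imperfect histories and the train/inference gap (exposure bias) narrows. This is not ERNIE-GEN's actual implementation; the function name corrupt_targets and the noise_rate value are illustrative assumptions.

```python
import random

def corrupt_targets(target_ids, vocab_size, noise_rate=0.05, seed=None):
    """Randomly replace a fraction of ground-truth target tokens with random
    vocabulary tokens. The corrupted sequence is used as the decoder's input
    history, while the loss is still computed against the original targets."""
    rng = random.Random(seed)
    corrupted = list(target_ids)
    for i in range(len(corrupted)):
        if rng.random() < noise_rate:
            corrupted[i] = rng.randrange(vocab_size)  # inject a noisy token
    return corrupted

# Example usage: the decoder attends to `noisy_history` but is trained to
# predict the clean `target_ids`.
target_ids = [12, 873, 44, 9021, 7, 305]
noisy_history = corrupt_targets(target_ids, vocab_size=30000, noise_rate=0.2, seed=0)
print(noisy_history)
```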
