Paper Title
Improving Text Generation with Student-Forcing Optimal Transport
Paper Authors
Paper Abstract
Neural language models are often trained with maximum likelihood estimation (MLE), where the next word is generated conditioned on the ground-truth word tokens. During testing, however, the model is instead conditioned on previously generated tokens, resulting in what is termed exposure bias. To reduce this gap between training and testing, we propose using optimal transport (OT) to match the sequences generated in these two modes. An extension is further proposed to improve the OT learning, based on the structural and contextual information of the text sequences. The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
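The core idea — scoring how well a student-forced (free-running) sequence matches its teacher-forced counterpart under an optimal-transport cost — can be sketched with generic Sinkhorn iterations over token embeddings. This is a minimal illustrative sketch, not the paper's formulation: the cosine cost, uniform marginals, and function names here are assumptions.

```python
import numpy as np

def sinkhorn_ot(X, Y, reg=0.1, n_iters=100):
    """Entropic-regularized OT distance between two sequences of token
    embeddings X (m x d) and Y (n x d), via Sinkhorn iterations.
    Illustrative sketch only; not the authors' exact objective."""
    # Cost matrix: cosine distance between every pair of tokens.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    C = 1.0 - Xn @ Yn.T
    # Uniform marginals: each token carries equal mass.
    a = np.full(X.shape[0], 1.0 / X.shape[0])
    b = np.full(Y.shape[0], 1.0 / Y.shape[0])
    # Sinkhorn scaling of the Gibbs kernel K = exp(-C / reg).
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]  # approximate transport plan
    return float(np.sum(P * C))     # <P, C>: the OT matching cost

# Toy example: embeddings of a "teacher-forced" sequence and a nearby
# "student-forced" one; a close match yields a small OT cost, which
# could serve as a training signal to close the train/test gap.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(5, 8))
student = teacher + 0.01 * rng.normal(size=(5, 8))
loss = sinkhorn_ot(teacher, student)
```

Because OT compares the two sequences as distributions over embeddings rather than position by position, it tolerates local reorderings — one motivation for using it over a token-wise loss when the student-forced output drifts from the reference.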