Paper Title
Pretrained Language Models for Dialogue Generation with Multiple Input Sources
Paper Authors
Paper Abstract
Large-scale pretrained language models have achieved outstanding performance on natural language understanding tasks. However, it is still under investigation how to apply them to dialogue generation tasks, especially those whose responses are conditioned on multiple sources. Previous work simply concatenates all input sources or averages the information from different input sources. In this work, we study dialogue models with multiple input sources adapted from the pretrained language model GPT2. We explore various methods for fusing the separate attention information corresponding to different sources. Our experimental results show that proper fusion methods achieve higher relevance to the dialogue history than simple fusion baselines.
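To make the fusion idea concrete, below is a minimal PyTorch sketch of one way to combine per-source attention outputs with learned, query-dependent weights instead of a plain average. The module name, the choice of one cross-attention per source, and the softmax-weighted combination are illustrative assumptions for exposition, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiSourceAttentionFusion(nn.Module):
    """Illustrative fusion of per-source attention outputs (hypothetical module).

    Each input source (e.g., dialogue history, persona) is attended to
    separately; the resulting context vectors are combined with learned,
    query-dependent weights rather than a simple average.
    """

    def __init__(self, hidden_size: int, num_sources: int, num_heads: int = 8):
        super().__init__()
        # One cross-attention module per source (assumed design, for illustration).
        self.attentions = nn.ModuleList(
            nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
            for _ in range(num_sources)
        )
        # Produces one fusion score per source from its attended context.
        self.weight_proj = nn.Linear(hidden_size, 1)

    def forward(self, query: torch.Tensor, sources: list) -> torch.Tensor:
        # query:   (batch, tgt_len, hidden) decoder states from the GPT2 stack
        # sources: list of (batch, src_len_i, hidden) encoded source sequences
        contexts = []
        for attn, src in zip(self.attentions, sources):
            ctx, _ = attn(query, src, src)           # (batch, tgt_len, hidden)
            contexts.append(ctx)
        stacked = torch.stack(contexts, dim=2)        # (batch, tgt_len, n_src, hidden)
        # Query-dependent softmax weights over sources (one possible fusion variant).
        weights = F.softmax(self.weight_proj(stacked), dim=2)
        return (weights * stacked).sum(dim=2)         # (batch, tgt_len, hidden)


# Usage sketch: two sources (dialogue history and persona), toy dimensions.
if __name__ == "__main__":
    fusion = MultiSourceAttentionFusion(hidden_size=768, num_sources=2)
    decoder_states = torch.randn(4, 10, 768)
    history = torch.randn(4, 32, 768)
    persona = torch.randn(4, 16, 768)
    fused = fusion(decoder_states, [history, persona])
    print(fused.shape)  # torch.Size([4, 10, 768])
```

Setting the weights to a constant 1/n over sources recovers the simple averaging baseline mentioned in the abstract, which is what the learned fusion is compared against.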