Paper Title


DialoGen: Generalized Long-Range Context Representation for Dialogue Systems

Authors

Suvodip Dey, Maunendra Sankar Desarkar, Asif Ekbal, P. K. Srijith

Abstract


Long-range context modeling is crucial to both dialogue understanding and generation. The most popular method for dialogue context representation is to concatenate the last-$k$ utterances in chronological order. However, this method may not be ideal for conversations containing long-range dependencies, i.e., when there is a need to look beyond last-$k$ utterances to generate a meaningful response. In this work, we propose DialoGen, a novel encoder-decoder based framework for dialogue generation with a generalized context representation that can look beyond the last-$k$ utterances. The main idea of our approach is to identify and utilize the most relevant historical utterances instead of last-$k$, which also enables the compact representation of dialogue history with fewer tokens. We study the effectiveness of our proposed method on both dialogue generation (open-domain) and understanding (DST). Even with a compact context representation, DialoGen performs comparably to the state-of-the-art models on the open-domain DailyDialog dataset. We observe a similar behavior on the DST task of the MultiWOZ dataset when the proposed context representation is applied to existing DST models. We also discuss the generalizability and interpretability of DialoGen and show that the relevance score of previous utterances agrees well with human cognition.
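The core idea — scoring past utterances by relevance to the current turn and keeping the top-$k$ rather than the most recent $k$ — can be sketched as follows. Note that DialoGen learns its relevance scores inside an encoder-decoder model; the bag-of-words cosine similarity below is only a hypothetical stand-in to make the selection logic concrete.

```python
import math
import re
from collections import Counter

def tokens(text: str) -> Counter:
    """Lowercased bag-of-words counts."""
    return Counter(re.findall(r"\w+", text.lower()))

def relevance(utterance: str, query: str) -> float:
    """Toy relevance score: cosine similarity of bag-of-words vectors.
    (A stand-in for DialoGen's learned relevance scoring.)"""
    a, b = tokens(utterance), tokens(query)
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def select_context(history: list[str], query: str, k: int) -> list[str]:
    """Keep the k most relevant past utterances instead of the last k,
    preserving their chronological order in the final context."""
    ranked = sorted(range(len(history)),
                    key=lambda i: relevance(history[i], query),
                    reverse=True)
    keep = sorted(ranked[:k])  # restore chronological order
    return [history[i] for i in keep]

history = [
    "I booked a table at the Italian place for Friday.",
    "Great, what time?",
    "Seven pm. Also, do they have vegetarian options?",
    "Yes, plenty.",
    "Anyway, how was your trip?",
    "It was fine, thanks.",
]
query = "Can we move the Italian reservation to eight pm?"

# Picks the reservation-related turns, not merely the two most recent ones.
print(select_context(history, query, k=2))
```

With a long-range dependency like the one above (the relevant booking turn is several utterances back), a last-$k$ window with small $k$ would carry only the off-topic trip exchange, while relevance-based selection recovers the booking context with the same token budget.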
