Paper Title
Variational Transformers for Diverse Response Generation
Paper Authors
Paper Abstract
Despite the great promise of Transformers in many sequence modeling tasks (e.g., machine translation), their deterministic nature hinders them from generalizing to high-entropy tasks such as dialogue response generation. Previous work proposes to capture the variability of dialogue responses with a recurrent neural network (RNN)-based conditional variational autoencoder (CVAE). However, the autoregressive computation of the RNN limits training efficiency. Therefore, we propose the Variational Transformer (VT), a variational self-attentive feed-forward sequence model. The VT combines the parallelizability and global receptive field of the Transformer with the variational nature of the CVAE by incorporating stochastic latent variables into Transformers. We explore two variants of the VT: 1) modeling discourse-level diversity with a global latent variable; and 2) augmenting the Transformer decoder with a sequence of fine-grained latent variables. The proposed models are then evaluated on three conversational datasets with both automatic metrics and human evaluation. The experimental results show that our models improve over the standard Transformer and other baselines in terms of diversity, semantic relevance, and human judgment.
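To make the first variant (a single global latent variable conditioning the decoder) concrete, below is a minimal PyTorch-style sketch of how a CVAE prior/recognition network and the reparameterization trick can be wired into a Transformer encoder-decoder. This is our own illustration under stated assumptions, not the authors' released code; all class and module names (GlobalLatentVT, prior_net, recog_net, latent_to_model) are hypothetical, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

class GlobalLatentVT(nn.Module):
    """Illustrative sketch (not the authors' code) of a global-latent-variable
    Variational Transformer: a Transformer encoder summarizes the dialogue
    context, CVAE-style prior/recognition networks produce a Gaussian latent z,
    and z conditions every step of the Transformer decoder."""

    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6, latent_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)  # positional encodings omitted
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        # Prior p(z | context) and recognition q(z | context, response) networks,
        # each emitting the mean and log-variance of a diagonal Gaussian.
        self.prior_net = nn.Linear(d_model, 2 * latent_dim)
        self.recog_net = nn.Linear(2 * d_model, 2 * latent_dim)
        self.latent_to_model = nn.Linear(latent_dim, d_model)
        self.out_proj = nn.Linear(d_model, vocab_size)

    @staticmethod
    def reparameterize(mu, logvar):
        # z = mu + sigma * eps, with eps ~ N(0, I), so gradients flow through mu/logvar.
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def forward(self, context_ids, response_ids):
        ctx = self.encoder(self.embed(context_ids))           # (B, S, D)
        ctx_summary = ctx.mean(dim=1)                         # pooled context
        resp_summary = self.embed(response_ids).mean(dim=1)   # pooled gold response
        # Recognition network sees context and response (training only).
        mu_q, logvar_q = self.recog_net(
            torch.cat([ctx_summary, resp_summary], dim=-1)).chunk(2, dim=-1)
        mu_p, logvar_p = self.prior_net(ctx_summary).chunk(2, dim=-1)
        z = self.reparameterize(mu_q, logvar_q)
        # Condition all decoder positions on z by adding its projection.
        tgt = self.embed(response_ids) + self.latent_to_model(z).unsqueeze(1)
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(
            response_ids.size(1)).to(response_ids.device)
        logits = self.out_proj(self.decoder(tgt, ctx, tgt_mask=tgt_mask))
        # KL(q || p) between diagonal Gaussians, the ELBO regularizer.
        kl = 0.5 * (logvar_p - logvar_q
                    + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                    - 1).sum(dim=-1).mean()
        return logits, kl
```

Training would minimize the token-level cross-entropy on `logits` plus the `kl` term; at inference time, sampling z from the prior network (rather than the recognition network) for the same context is what yields diverse responses.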