A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation

Paper Title

A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation

Authors

Kexun Zhang, Rui Wang, Xu Tan, Junliang Guo, Yi Ren, Tao Qin, Tie-Yan Liu

Abstract

Because of their conditional independence assumption, non-autoregressive translation (NAT) models have difficulty capturing the multi-modal distribution of target translations. This is known as the "multi-modality problem", which includes lexical multi-modality and syntactic multi-modality. While lexical multi-modality has been well studied, syntactic multi-modality poses a severe challenge to the standard cross entropy (XE) loss in NAT and remains understudied. In this paper, we conduct a systematic study of the syntactic multi-modality problem. Specifically, we decompose it into short- and long-range syntactic multi-modalities and evaluate several recent NAT algorithms with advanced loss functions on both carefully designed synthesized datasets and real datasets. We find that the Connectionist Temporal Classification (CTC) loss and the Order-Agnostic Cross Entropy (OAXE) loss better handle short- and long-range syntactic multi-modalities, respectively. Furthermore, we take the best of both and design a new loss function to better handle the complicated syntactic multi-modality in real-world datasets. To facilitate practical usage, we provide a guide to choosing loss functions for different kinds of syntactic multi-modality.
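
As a rough illustration of why word-order variation hurts the position-wise XE loss, the toy sketch below compares standard cross entropy with an order-agnostic variant on a three-token sentence. It is not the paper's implementation: the per-position probabilities, the vocabulary, and the function names `xe_loss` and `order_agnostic_xe` are all hypothetical, and the brute-force search over permutations is only a stand-in for whatever efficient matching a real order-agnostic loss would use.

```python
import itertools
import math


def xe_loss(log_probs, target):
    """Position-wise cross entropy: position i is scored against target[i]."""
    return -sum(log_probs[i][tok] for i, tok in enumerate(target))


def order_agnostic_xe(log_probs, target):
    """Lowest XE over every ordering of the target tokens.

    Brute force over permutations; an assumption made for this toy example,
    feasible only for very short sentences.
    """
    return min(xe_loss(log_probs, perm) for perm in itertools.permutations(target))


# Hypothetical output of a NAT decoder that prefers the order "today I work".
probs = [
    {"today": 0.8, "I": 0.1, "work": 0.1},
    {"today": 0.1, "I": 0.8, "work": 0.1},
    {"today": 0.1, "I": 0.1, "work": 0.8},
]
log_probs = [{tok: math.log(p) for tok, p in dist.items()} for dist in probs]

# The reference uses a different but equally valid word order.
reference = ["I", "work", "today"]

xe = xe_loss(log_probs, reference)
oaxe = order_agnostic_xe(log_probs, reference)

print(f"position-wise XE:  {xe:.3f}")
print(f"order-agnostic XE: {oaxe:.3f}")
```

In this toy case the position-wise loss heavily penalizes an output that contains exactly the right words in another valid order, while the order-agnostic variant does not, which is the intuition behind using such a loss for long-range reordering.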
