Paper Title

Can Domains Be Transferred Across Languages in Multi-Domain Multilingual Neural Machine Translation?

Authors

Thuy-Trang Vu, Shahram Khadivi, Xuanli He, Dinh Phung, Gholamreza Haffari

Abstract

Previous work has mostly focused on either the multilingual or the multi-domain aspect of neural machine translation (NMT). This paper investigates whether domain information can be transferred across languages in the composition of multi-domain and multilingual NMT, particularly under the incomplete data condition where in-domain bitext is missing for some language pairs. Our results in curated leave-one-domain-out experiments show that multi-domain multilingual (MDML) NMT can boost zero-shot translation performance by up to +10 BLEU, and can also aid the generalisation of multi-domain NMT to the missing domain. We also explore strategies for the effective integration of multilingual and multi-domain NMT, including language and domain tag combination and auxiliary task training. We find that learning domain-aware representations and adding target-language tags to the encoder lead to effective MDML-NMT.

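The tag-combination strategy the abstract refers to follows the common practice of steering a single encoder-decoder model with special prefix tokens. Below is a minimal Python sketch of this idea; the tag formats (`<law>`, `<2de>`) and the helper `tag_source` are illustrative assumptions for exposition, not the paper's exact implementation.

```python
# Minimal sketch of prefix tagging for multi-domain multilingual NMT:
# each source sentence is prefixed with a domain tag and a
# target-language tag before being fed to the encoder.

def tag_source(sentence: str, domain: str, tgt_lang: str) -> str:
    """Prepend a domain tag and a target-language tag to a source sentence."""
    return f"<{domain}> <2{tgt_lang}> {sentence}"

# Example: translating an English legal-domain sentence into German.
src = "The parties agree to the following terms."
print(tag_source(src, domain="law", tgt_lang="de"))
# -> "<law> <2de> The parties agree to the following terms."
```

Because the tags are ordinary vocabulary items seen during training, the model can, in principle, be asked for a domain-language combination it never observed as parallel data, which is the zero-shot setting the paper evaluates.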