Paper Title

Domain Specific Sub-network for Multi-Domain Neural Machine Translation

Authors

Hendy, Amr; Abdelghaffar, Mohamed; Afify, Mohamed; Tawfik, Ahmed Y.

Abstract

This paper presents the Domain-Specific Sub-network (DoSS). It uses a set of masks obtained through pruning to define a sub-network for each domain and finetunes the sub-network parameters on domain data. This performs very close to finetuning the whole network on each domain while drastically reducing the number of parameters. A method to make the masks unique per domain is also proposed and shown to greatly improve generalization to unseen domains. In our experiments on German-to-English machine translation, the proposed method outperforms the strong baseline of continued training on multi-domain (medical, tech and religion) data by 1.47 BLEU points. Continued training of DoSS on a new domain (legal) also outperforms the multi-domain (medical, tech, religion, legal) baseline by 1.52 BLEU points.
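The masking idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not say which pruning criterion is used or how masks are made unique, so the sketch assumes magnitude pruning and assumes "unique" means dropping any position claimed by more than one domain. The helper names (`magnitude_mask`, `make_masks_disjoint`, `masked_sgd_step`), the 4x4 weight matrix, the keep ratio and the learning rate are all hypothetical.

```python
import numpy as np


def magnitude_mask(weights, keep_ratio):
    """Boolean mask keeping roughly the top `keep_ratio` fraction of entries by |w|.

    Assumption: magnitude pruning; the abstract only says "pruning".
    """
    k = max(1, int(round(keep_ratio * weights.size)))
    # Threshold is the k-th largest absolute value over the flattened array.
    threshold = np.sort(np.abs(weights), axis=None)[-k]
    return np.abs(weights) >= threshold


def make_masks_disjoint(masks):
    """Keep only positions selected by exactly one domain.

    Assumption: one possible reading of "masks unique per domain".
    """
    counts = np.sum(list(masks.values()), axis=0)
    return {domain: mask & (counts == 1) for domain, mask in masks.items()}


def masked_sgd_step(weights, grad, mask, lr=0.1):
    """Gradient step that changes only the entries inside the domain mask."""
    return weights - lr * grad * mask


rng = np.random.default_rng(0)
shared = rng.normal(size=(4, 4))  # stand-in for one shared weight matrix

# Hypothetical per-domain weights, e.g. after finetuning on each domain.
domain_weights = {
    d: shared + 0.5 * rng.normal(size=shared.shape)
    for d in ("medical", "tech", "religion")
}

masks = make_masks_disjoint(
    {d: magnitude_mask(w, keep_ratio=0.25) for d, w in domain_weights.items()}
)

# One masked update for the medical sub-network with a dummy gradient.
grad = np.ones_like(shared)
updated = masked_sgd_step(shared, grad, masks["medical"])
```

Because each mask selects a small fraction of the weights, only that fraction needs to be stored per domain, which is the source of the parameter saving the abstract describes relative to keeping a fully finetuned copy of the network for every domain.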
