Paper Title
Semi-Supervised Bilingual Lexicon Induction with Two-way Interaction
Paper Authors
Abstract
Semi-supervision is a promising paradigm for Bilingual Lexicon Induction (BLI) with limited annotations. However, previous semi-supervised methods do not fully utilize the knowledge hidden in annotated and non-annotated data, which limits their performance. In this paper, we propose a new semi-supervised BLI framework that encourages interaction between the supervised signal and unsupervised alignment. We design two message-passing mechanisms to transfer knowledge between annotated and non-annotated data, named prior optimal transport and bi-directional lexicon update, respectively. We then perform semi-supervised learning based on a cyclic or a parallel parameter feeding routine to update our models. Our framework is general and can incorporate any supervised and unsupervised BLI methods based on optimal transport. Experimental results on the MUSE and VecMap datasets show significant improvements from our models. An ablation study also shows that the two-way interaction between the supervised signal and unsupervised alignment accounts for the gain in overall performance. Results on distant language pairs further illustrate the advantage and robustness of our proposed method.
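To make the idea of a two-way interaction concrete, the following is a minimal toy sketch, not the method proposed in the paper. It assumes, purely for illustration, that the supervised step is an orthogonal Procrustes fit on the current lexicon, that the unsupervised step is entropy-regularized optimal transport solved with Sinkhorn iterations, and that the lexicon is extended with pairs on which both translation directions agree. All function names (`procrustes`, `sinkhorn`, `mutual_pairs`, `induce_lexicon`) and parameter values are hypothetical, and the data is synthetic.

```python
import numpy as np

def procrustes(src, tgt):
    """Supervised step (assumed): orthogonal W minimizing ||src @ W - tgt||_F."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

def sinkhorn(cost, reg=0.05, n_iter=200):
    """Unsupervised step (assumed): entropic OT plan with uniform marginals."""
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    kernel = np.exp(-cost / reg)
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]

def cosine_cost(a, b):
    """Pairwise cost 1 - cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T

def mutual_pairs(plan):
    """Keep (i, j) only if j is the best target for i and i the best source for j."""
    fwd = plan.argmax(axis=1)
    bwd = plan.argmax(axis=0)
    return [(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i]

def induce_lexicon(X, Y, seed_pairs, n_rounds=3):
    """Alternate the two steps: the lexicon shapes the OT cost, OT extends the lexicon."""
    lexicon = set(seed_pairs)
    for _ in range(n_rounds):
        src_idx, tgt_idx = map(list, zip(*sorted(lexicon)))
        W = procrustes(X[src_idx], Y[tgt_idx])   # supervised signal -> mapping
        plan = sinkhorn(cosine_cost(X @ W, Y))   # mapping -> transport plan
        lexicon |= set(mutual_pairs(plan))       # transport plan -> new lexicon entries
    return W, lexicon  # W is the mapping fitted in the final round

# Synthetic check: the "target language" is an exact rotation of the source space.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
Y = X @ Q
W, lexicon = induce_lexicon(X, Y, seed_pairs=[(i, i) for i in range(12)])
print(len(lexicon), np.allclose(W, Q))
```

The alternating loop here stands in loosely for a cyclic routine in which each component feeds the other; a parallel variant and the paper's actual prior optimal transport formulation are not represented by this sketch.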