Paper Title
Mismatching-Aware Unsupervised Translation Quality Estimation For Low-Resource Languages
Authors
Abstract
Translation Quality Estimation (QE) is the task of predicting the quality of machine translation (MT) output without any reference. The task has gained increasing attention as an important component of practical MT applications. In this paper, we first propose XLMRScore, a cross-lingual counterpart of BERTScore computed with the XLM-RoBERTa (XLMR) model. This metric can be used as a simple unsupervised QE method, but it faces two issues: first, untranslated tokens lead to unexpectedly high translation scores, and second, the greedy matching in XLMRScore produces mismatching errors between source and hypothesis tokens. To mitigate these issues, we suggest replacing untranslated words with the unknown token and applying cross-lingual alignment to the pre-trained model so that aligned words are represented closer to each other, respectively. We evaluate the proposed method on four low-resource language pairs of the WMT21 QE shared task, as well as a new English$\rightarrow$Persian (En-Fa) test dataset introduced in this paper. Experiments show that our method obtains results comparable to the supervised baseline in two zero-shot scenarios, i.e., with less than 0.01 difference in Pearson correlation, while outperforming unsupervised rivals in all the low-resource language pairs by more than 8% on average.
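As a rough illustration of the BERTScore-style greedy matching that XLMRScore builds on, the sketch below scores a hypothesis against the source via cosine similarity of token embeddings. This is a minimal sketch, not the paper's implementation: real XLMRScore uses contextual XLM-RoBERTa embeddings for both sentences, whereas here the embedding matrices are simply passed in as NumPy arrays (toy stand-ins).

```python
import numpy as np

def greedy_match_score(src_emb: np.ndarray, hyp_emb: np.ndarray) -> float:
    """BERTScore-style greedy matching over token embeddings.

    src_emb: (n_src, d) embeddings of source tokens
    hyp_emb: (n_hyp, d) embeddings of hypothesis tokens
    Returns the F1 of greedy precision and recall.
    """
    # Normalize rows to unit length so dot products are cosine similarities.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    hyp = hyp_emb / np.linalg.norm(hyp_emb, axis=1, keepdims=True)

    sim = hyp @ src.T  # (n_hyp, n_src) cosine similarity matrix

    # Greedy matching: each token is paired with its most similar
    # counterpart on the other side (this is where mismatching errors
    # can arise when cross-lingual embeddings are poorly aligned).
    precision = sim.max(axis=1).mean()  # best source match per hyp token
    recall = sim.max(axis=0).mean()     # best hyp match per source token
    return 2 * precision * recall / (precision + recall)
```

With identical embedding sets the score is 1.0; the mitigation discussed in the abstract (replacing untranslated words with the unknown token) exploits exactly this: a copied, untranslated token would otherwise match itself near-perfectly and inflate the score.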