Paper Title

Automatic Machine Translation Evaluation in Many Languages via Zero-Shot Paraphrasing

Paper Authors

Brian Thompson, Matt Post

Paper Abstract

We frame the task of machine translation evaluation as one of scoring machine translation output with a sequence-to-sequence paraphraser, conditioned on a human reference. We propose training the paraphraser as a multilingual NMT system, treating paraphrasing as a zero-shot translation task (e.g., Czech to Czech). This results in the paraphraser's output mode being centered around a copy of the input sequence, which represents the best case scenario where the MT system output matches a human reference. Our method is simple and intuitive, and does not require human judgements for training. Our single model (trained in 39 languages) outperforms or statistically ties with all prior metrics on the WMT 2019 segment-level shared metrics task in all languages (excluding Gujarati where the model had no training data). We also explore using our model for the task of quality estimation as a metric--conditioning on the source instead of the reference--and find that it significantly outperforms every submission to the WMT 2019 shared task on quality estimation in every language pair.
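The scoring recipe described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released implementation: it assumes a trained multilingual paraphraser exposing a function that returns per-token log-probabilities of a hypothesis force-decoded against a conditioning sequence (the human reference for metric scoring, or the source sentence for the quality-estimation variant). The `score_segment` and `toy_log_prob` names are hypothetical; the toy stand-in only mimics the copy-centered output mode the abstract describes.

```python
# Minimal sketch of reference-conditioned scoring (hypothetical interface).
# A paraphraser assigns log-probabilities to the MT system output given the
# human reference; the segment score is the length-normalized sum of
# per-token log-probs, so higher means closer to the reference.

import math
from typing import Callable, List

def score_segment(
    log_prob: Callable[[List[str], List[str]], List[float]],
    conditioning_tokens: List[str],
    hypothesis_tokens: List[str],
) -> float:
    """Average per-token log-prob of the hypothesis, force-decoded against
    the conditioning sequence (reference for metric use, source for QE)."""
    token_log_probs = log_prob(conditioning_tokens, hypothesis_tokens)
    return sum(token_log_probs) / len(token_log_probs)

# Toy stand-in for a trained paraphraser: it rewards tokens that copy the
# conditioning sequence, mimicking the copy-centered output mode that
# zero-shot paraphrase training (e.g., Czech to Czech) produces.
def toy_log_prob(src: List[str], tgt: List[str]) -> List[float]:
    vocab = set(src)
    return [math.log(0.9) if tok in vocab else math.log(0.01) for tok in tgt]

ref = "the cat sat on the mat".split()
hyp_good = "the cat sat on the mat".split()
hyp_bad = "a feline occupied a rug".split()
print(score_segment(toy_log_prob, ref, hyp_good))  # near log(0.9): good match
print(score_segment(toy_log_prob, ref, hyp_bad))   # much lower score
```

In a real system the `log_prob` callback would come from force-decoding the multilingual NMT paraphraser; the toy version above exists only so the sketch runs end to end.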
