Paper Title

Can Synthetic Translations Improve Bitext Quality?

Authors

Eleftheria Briakou, Marine Carpuat

Abstract

Synthetic translations have been used for a wide range of NLP tasks primarily as a means of data augmentation. This work explores, instead, how synthetic translations can be used to revise potentially imperfect reference translations in mined bitext. We find that synthetic samples can improve bitext quality without any additional bilingual supervision when they replace the originals based on a semantic equivalence classifier that helps mitigate NMT noise. The improved quality of the revised bitext is confirmed intrinsically via human evaluation and extrinsically through bilingual induction and MT tasks.
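
The selective replacement idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `revise_bitext` and its two callables, `translate` (standing in for an NMT system) and `equivalence_score` (standing in for a semantic equivalence classifier), are hypothetical names introduced here, and the rule "replace only when the synthetic translation scores strictly higher than the original" is an assumed reading of "replace the originals based on a semantic equivalence classifier".

```python
from typing import Callable, Iterable, List, Tuple

SentencePair = Tuple[str, str]


def revise_bitext(
    pairs: Iterable[SentencePair],
    translate: Callable[[str], str],
    equivalence_score: Callable[[str, str], float],
) -> List[SentencePair]:
    """Return a revised bitext (illustrative sketch, not the paper's code).

    `translate` and `equivalence_score` are hypothetical stand-ins for an
    NMT model and a semantic equivalence classifier.
    """
    revised: List[SentencePair] = []
    for source, reference in pairs:
        # Generate a synthetic translation of the source sentence.
        candidate = translate(source)
        # Assumed rule: keep the original unless the classifier rates the
        # synthetic translation as strictly more equivalent to the source.
        if equivalence_score(source, candidate) > equivalence_score(source, reference):
            revised.append((source, candidate))
        else:
            revised.append((source, reference))
    return revised
```

Gating the replacement on the classifier score, rather than always substituting the synthetic output, is what would keep noisy NMT output from overwriting references that were already adequate.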
