Improved Data Augmentation for Translation Suggestion

Paper title

Improved Data Augmentation for Translation Suggestion

Authors

Hongxiao Zhang, Siyu Lai, Songming Zhang, Hui Huang, Yufeng Chen, Jinan Xu, Jian Liu

Abstract

Translation suggestion (TS) models are used to automatically provide alternative suggestions for incorrect spans in sentences generated by machine translation. This paper introduces the system used in our submission to the WMT'22 Translation Suggestion shared task. Our system is based on the ensemble of different translation architectures, including Transformer, SA-Transformer, and DynamicConv. We use three strategies to construct synthetic data from parallel corpora to compensate for the lack of supervised data. In addition, we introduce a multi-phase pre-training strategy, adding an additional pre-training phase with in-domain data. We rank second and third on the English-German and English-Chinese bidirectional tasks, respectively.
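
The abstract does not say which three strategies are used to build synthetic data, so the following is only an illustrative sketch of one generic way to derive a TS-style training example from a parallel sentence pair: mask a random span of the reference translation and treat the masked tokens as the suggestion to predict. The function name `make_ts_example`, the `<mask>` placeholder, and the `max_span_len` parameter are assumptions made for this example and are not taken from the paper.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, not from the paper


def make_ts_example(source, target, rng, max_span_len=3):
    """Build one synthetic TS example by masking a random target span.

    Illustrative assumption only: the paper's three actual strategies
    are not described in the abstract.
    """
    tokens = target.split()
    if not tokens:
        raise ValueError("target sentence is empty")
    # Choose a span length, then a start position that keeps the span in range.
    span_len = rng.randint(1, min(max_span_len, len(tokens)))
    start = rng.randint(0, len(tokens) - span_len)
    end = start + span_len
    masked = tokens[:start] + [MASK] + tokens[end:]
    return {
        "source": source,
        "masked_target": " ".join(masked),
        "suggestion": " ".join(tokens[start:end]),
    }


rng = random.Random(0)  # fixed seed so the example is reproducible
example = make_ts_example(
    "Das ist ein kleiner Test .", "This is a small test .", rng
)
print(example)
```

Substituting the suggestion back into the masked target recovers the original reference, which gives a quick sanity check for examples generated this way.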
