Extending Word-Level Quality Estimation for Post-Editing Assistance

Paper Title

Extending Word-Level Quality Estimation for Post-Editing Assistance

Authors

Yizhen Wei, Takehito Utsuro, Masaaki Nagata

Abstract

We define a novel concept called extended word alignment in order to improve the efficiency of post-editing assistance. Based on extended word alignment, we further propose a novel task called refined word-level QE, which outputs refined tags and word-level correspondences. Compared to original word-level QE, the new task can directly point out editing operations, thus improving efficiency. To extract extended word alignment, we adopt a supervised method based on mBERT. To solve refined word-level QE, we first predict original QE tags by training a regression model for sequence tagging based on mBERT and XLM-R. Then, we refine the original word tags with extended word alignment. In addition, we extract source-gap correspondences while also obtaining gap tags. Experiments on two language pairs show the feasibility of our method and give us inspiration for further improvement.
