论文标题
扩展单词级质量估算以进行后编辑帮助
Extending Word-Level Quality Estimation for Post-Editing Assistance
论文作者
论文摘要
我们定义了一个名为“扩展单词对齐”的新颖概念,以提高后编辑辅助效率。基于扩展的单词对齐方式,我们进一步提出了一个名为精致的单词量量化量子的新颖任务,该任务输出精制标签和文字级别的对应关系。与原始单词级别的量化宽松相比,新任务能够直接指出编辑操作,从而提高效率。为了提取扩展单词对齐,我们采用了基于Mbert的监督方法。为了求解精致的单词量量化宽松,我们首先通过训练基于MBERT和XLM-R的序列标记的回归模型来预测原始量化量子标签。然后,我们以扩展单词对齐方式完善原始字标签。另外,我们提取源空隙对应关系,同时获得GAP标签。两种语言对的实验显示了我们方法的可行性,并为我们提供了进一步改进的灵感。
We define a novel concept called extended word alignment in order to improve post-editing assistance efficiency. Based on extended word alignment, we further propose a novel task called refined word-level QE that outputs refined tags and word-level correspondences. Compared to original word-level QE, the new task is able to directly point out editing operations, thus improves efficiency. To extract extended word alignment, we adopt a supervised method based on mBERT. To solve refined word-level QE, we firstly predict original QE tags by training a regression model for sequence tagging based on mBERT and XLM-R. Then, we refine original word tags with extended word alignment. In addition, we extract source-gap correspondences, meanwhile, obtaining gap tags. Experiments on two language pairs show the feasibility of our method and give us inspirations for further improvement.