Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks

Paper title

Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks

Authors

Denis Emelin, Ivan Titov, Rico Sennrich

Abstract

Word sense disambiguation is a well-known source of translation errors in NMT. We posit that some of the incorrect disambiguation choices are due to models' over-reliance on dataset artifacts found in training data, specifically superficial word co-occurrences, rather than a deeper understanding of the source text. We introduce a method for the prediction of disambiguation errors based on statistical data properties, demonstrating its effectiveness across several domains and model types. Moreover, we develop a simple adversarial attack strategy that minimally perturbs sentences in order to elicit disambiguation errors to further probe the robustness of translation models. Our findings indicate that disambiguation robustness varies substantially between domains and that different models trained on the same data are vulnerable to different attacks.
