MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation

Paper title

MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation

Authors

Anna Currey, Maria Nădejde, Raghavendra Pappagari, Mia Mayer, Stanislas Lauly, Xing Niu, Benjamin Hsu, Georgiana Dinu

Abstract

As generic machine translation (MT) quality has improved, the need for targeted benchmarks that explore fine-grained aspects of quality has increased. In particular, gender accuracy in translation can have implications in terms of output fluency, translation accuracy, and ethics. In this paper, we introduce MT-GenEval, a benchmark for evaluating gender accuracy in translation from English into eight widely spoken languages. MT-GenEval complements existing benchmarks by providing realistic, gender-balanced, counterfactual data in eight language pairs where the gender of individuals is unambiguous in the input segment, including multi-sentence segments requiring inter-sentential gender agreement. Our data and code are publicly available under a CC BY SA 3.0 license.
