Paper Title
MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation
Paper Authors
Abstract
As generic machine translation (MT) quality has improved, the need for targeted benchmarks that explore fine-grained aspects of quality has increased. In particular, gender accuracy in translation can have implications for output fluency, translation accuracy, and ethics. In this paper, we introduce MT-GenEval, a benchmark for evaluating gender accuracy in translation from English into eight widely spoken languages. MT-GenEval complements existing benchmarks by providing realistic, gender-balanced, counterfactual data in eight language pairs where the gender of individuals is unambiguous in the input segment, including multi-sentence segments requiring inter-sentential gender agreement. Our data and code are publicly available under a CC BY-SA 3.0 license.