Paper Title

Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus

Authors

Luisa Bentivogli, Beatrice Savoldi, Matteo Negri, Mattia Antonino Di Gangi, Roldano Cattoni, Marco Turchi

Abstract

Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines. This difficulty is also due to the fact that the training data on which models are built typically reflect the asymmetries of natural languages, gender bias included. Exclusively fed with textual data, machine translation is intrinsically constrained by the fact that the input sentence does not always contain clues about the gender identity of the referred human entities. But what happens with speech translation, where the input is an audio signal? Can audio provide additional information to reduce gender bias? We present the first thorough investigation of gender bias in speech translation, contributing with: i) the release of a benchmark useful for future studies, and ii) the comparison of different technologies (cascade and end-to-end) on two language directions (English-Italian/French).
