Paper Title
Self-Supervised Representations Improve End-to-End Speech Translation
Paper Authors
Paper Abstract
End-to-end speech-to-text translation can yield simpler and smaller systems, but it faces the challenge of data scarcity. Pre-training methods can leverage unlabeled data and have been shown to be effective in data-scarce settings. In this work, we explore whether self-supervised pre-trained speech representations can benefit the speech translation task in both high- and low-resource settings, whether they transfer well to other languages, and whether they can be effectively combined with other common methods for improving low-resource end-to-end speech translation, such as using a pre-trained high-resource speech recognition system. We demonstrate that self-supervised pre-trained features consistently improve translation performance, and that cross-lingual transfer extends to a variety of languages with little or no tuning.
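The core pattern the abstract describes is replacing spectral input features with representations from a self-supervised pre-trained speech encoder. Below is a minimal sketch of that pattern, assuming torchaudio's wav2vec 2.0 pipeline as a stand-in for the paper's feature extractor (the paper itself uses wav2vec features, not necessarily this API); the filename "utterance.wav" is a hypothetical input.

```python
import torch
import torchaudio

# Load a self-supervised pre-trained speech model. wav2vec 2.0 is used here
# as an illustrative stand-in; any pre-trained encoder exposing a
# feature-extraction interface follows the same pattern.
bundle = torchaudio.pipelines.WAV2VEC2_BASE
ssl_model = bundle.get_model().eval()

# Load and, if needed, resample the input utterance to the model's rate.
waveform, sample_rate = torchaudio.load("utterance.wav")  # hypothetical file
if sample_rate != bundle.sample_rate:
    waveform = torchaudio.functional.resample(waveform, sample_rate, bundle.sample_rate)

with torch.no_grad():
    # extract_features returns a list of per-layer representations;
    # the last layer serves as the frame-level speech features here.
    features, _ = ssl_model.extract_features(waveform)
speech_features = features[-1]  # shape: (batch, frames, feature_dim)

# These features would replace log-mel filterbank inputs to an
# end-to-end speech translation encoder-decoder.
print(speech_features.shape)
```

In this setup, the pre-trained extractor can be kept frozen or fine-tuned along with the translation model; which choice works better in low-resource settings is one of the questions the paper investigates.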