论文标题
非洲语言的大型词汇识别:多语言建模和自学学习
Large vocabulary speech recognition for languages of Africa: multilingual modeling and self-supervised learning
论文作者
论文摘要
在非洲使用的2,000多种语言几乎都没有广泛可用的自动语音识别系统,并且所需的数据也仅适用于几种语言。我们已经尝试了两种技术,这些技术可能为非洲语言提供大型词汇识别的途径:多语言建模和自我监督的学习。我们收集了可用的开源数据并收集了15种语言的数据,并使用这些技术训练了实验模型。我们的结果表明,在多语言端到端模型中汇总可用的少量数据,并且对无监督的数据进行预培训可以帮助提高许多非洲语言的语音识别质量。
Almost none of the 2,000+ languages spoken in Africa have widely available automatic speech recognition systems, and the required data is also only available for a few languages. We have experimented with two techniques which may provide pathways to large vocabulary speech recognition for African languages: multilingual modeling and self-supervised learning. We gathered available open source data and collected data for 15 languages, and trained experimental models using these techniques. Our results show that pooling the small amounts of data available in multilingual end-to-end models, and pre-training on unsupervised data can help improve speech recognition quality for many African languages.