Paper Title

Learning ASR pathways: A sparse multilingual ASR model

Paper Authors

Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, Ozlem Kalinli

Paper Abstract

Neural network pruning compresses automatic speech recognition (ASR) models effectively. However, in multilingual ASR, language-agnostic pruning may lead to severe performance drops on some languages because language-agnostic pruning masks may not fit all languages and discard important language-specific parameters. In this work, we present ASR pathways, a sparse multilingual ASR model that activates language-specific sub-networks ("pathways"), such that the parameters for each language are learned explicitly. With the overlapping sub-networks, the shared parameters can also enable knowledge transfer for lower-resource languages via joint multilingual training. We propose a novel algorithm to learn ASR pathways, and evaluate the proposed method on 4 languages with a streaming RNN-T model. Our proposed ASR pathways outperform both dense models and a language-agnostically pruned model, and provide better performance on low-resource languages compared to the monolingual sparse models.
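
The core mechanism described in the abstract, a single set of shared dense weights with a per-language binary mask that activates a language-specific sub-network, can be sketched in a few lines of PyTorch. The sketch below is illustrative only: the class name `PathwayLinear`, the language codes, and the random mask initialization are assumptions, and the paper's actual algorithm for learning the pathway masks during joint multilingual training is not reproduced here.

```python
import torch
import torch.nn as nn

# Minimal sketch of the "pathways" idea, assuming a single linear layer:
# one shared weight matrix, plus a fixed binary mask per language that
# selects that language's sub-network. The masks here are random for
# demonstration; the paper learns them. Class/argument names are
# hypothetical, not from the paper.
class PathwayLinear(nn.Module):
    def __init__(self, in_features, out_features, languages, sparsity=0.7):
        super().__init__()
        # Dense parameters shared across all languages.
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # One binary mask per language. In real code these would be
        # registered as buffers so they move with the module's device.
        self.masks = {
            lang: (torch.rand(out_features, in_features) > sparsity).float()
            for lang in languages
        }

    def forward(self, x, lang):
        # Activate only this language's pathway; gradients then flow only
        # into the unmasked (language-specific and shared) parameters.
        masked_weight = self.weight * self.masks[lang]
        return nn.functional.linear(x, masked_weight, self.bias)

# Usage: route each utterance through the pathway of its language.
# Language codes below are placeholders for the paper's 4 languages.
layer = PathwayLinear(in_features=80, out_features=256,
                      languages=["en", "fr", "de", "es"])
features = torch.randn(4, 80)       # a batch of acoustic feature vectors
out = layer(features, lang="fr")    # activates the "fr" pathway only
print(out.shape)                    # torch.Size([4, 256])
```

Because the per-language masks overlap, a gradient step taken on one language's data also updates weights that sit inside other languages' pathways; those shared parameters are the knowledge-transfer mechanism for lower-resource languages that the abstract describes.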
