Learning Fast Adaptation on Cross-Accented Speech Recognition

Paper Title

Learning Fast Adaptation on Cross-Accented Speech Recognition

Authors

Genta Indra Winata, Samuel Cahyawijaya, Zihan Liu, Zhaojiang Lin, Andrea Madotto, Peng Xu, Pascale Fung

Abstract

Local dialects lead people to pronounce words of the same language differently from each other. The great variability and complex characteristics of accents create a major challenge for training a robust and accent-agnostic automatic speech recognition (ASR) system. In this paper, we introduce a cross-accented English speech recognition task as a benchmark for measuring a model's ability to adapt to unseen accents, built on the existing CommonVoice corpus. We also propose an accent-agnostic approach that extends the model-agnostic meta-learning (MAML) algorithm for fast adaptation to unseen accents. In terms of word error rate, our approach significantly outperforms joint training in the zero-shot, few-shot, and all-shot scenarios in both the mixed-region and cross-region settings.
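The abstract reports its results in terms of word error rate (WER). As a minimal sketch, assuming the standard definition (word-level edit distance between the reference and the hypothesis, divided by the number of reference words), the metric could be computed as follows. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a six-word reference.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Lower values are better, and because insertions count as errors the value can exceed 1.0 when the hypothesis is much longer than the reference.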
