Paper Title
Learning Fast Adaptation on Cross-Accented Speech Recognition
Authors
Abstract
Local dialects lead people to pronounce words of the same language differently from one another. The great variability and complex characteristics of accents create a major challenge for training a robust, accent-agnostic automatic speech recognition (ASR) system. In this paper, we introduce a cross-accented English speech recognition task, built on the existing CommonVoice corpus, as a benchmark for measuring a model's ability to adapt to unseen accents. We also propose an accent-agnostic approach that extends the model-agnostic meta-learning (MAML) algorithm for fast adaptation to unseen accents. In terms of word error rate, our approach significantly outperforms joint training in the zero-shot, few-shot, and all-shot scenarios, in both the mixed-region and cross-region settings.
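The abstract reports its results in word error rate (WER). As a reminder of what that metric measures, the following is a minimal sketch, assuming whitespace tokenization and the usual definition of WER as word-level edit distance divided by the number of reference words. The function name `word_error_rate` is hypothetical, and this is not the evaluation code used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace; no text normalization
    (casing, punctuation) is applied here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER under this definition can exceed 1.0 when the hypothesis is much longer than the reference.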