Paper Title

Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment

Authors

Mu Yang, Kevin Hirschi, Stephen D. Looney, Okim Kang, John H. L. Hansen

Abstract

Current leading mispronunciation detection and diagnosis (MDD) systems achieve promising performance via end-to-end phoneme recognition. One challenge of such end-to-end solutions is the scarcity of human-annotated phonemes on natural L2 speech. In this work, we leverage unlabeled L2 speech via a pseudo-labeling (PL) procedure and extend the fine-tuning approach based on pre-trained self-supervised learning (SSL) models. Specifically, we use Wav2vec 2.0 as our SSL model, and fine-tune it using original labeled L2 speech samples plus the created pseudo-labeled L2 speech samples. Our pseudo labels are dynamic and are produced by an ensemble of the online model on-the-fly, which ensures that our model is robust to pseudo label noise. We show that fine-tuning with pseudo labels achieves a 5.35% phoneme error rate reduction and a 2.48% MDD F1 score improvement over a labeled-samples-only fine-tuning baseline. The proposed PL method is also shown to outperform conventional offline PL methods. Compared to state-of-the-art MDD systems, our MDD solution produces a more accurate and consistent phonetic error diagnosis. In addition, we conduct an open test on a separate UTD-4Accents dataset, where our system's recognition outputs show a strong correlation with human perception, based on accentedness and intelligibility.
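
To make the training procedure sketched in the abstract concrete, below is a minimal PyTorch sketch of momentum pseudo-labeling for CTC fine-tuning: a student wav2vec 2.0 model is fine-tuned on labeled L2 speech plus pseudo-labeled speech, while a momentum (EMA) teacher, effectively an ensemble of recent online-model states, regenerates the pseudo labels on the fly. The checkpoint name, data loaders, and momentum coefficient are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal momentum pseudo-labeling (MPL) sketch for wav2vec 2.0 CTC fine-tuning.
# `labeled_loader`, `unlabeled_loader`, the checkpoint, and `momentum` are
# illustrative assumptions, not the paper's exact settings.
import copy
import torch
from transformers import Wav2Vec2ForCTC

student = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base")
teacher = copy.deepcopy(student)            # momentum (EMA) teacher
for p in teacher.parameters():
    p.requires_grad_(False)
student.train()
teacher.eval()

optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)
momentum = 0.999                            # EMA coefficient (assumed)
blank_id = student.config.pad_token_id      # CTC blank token in HF wav2vec 2.0

def greedy_ctc_labels(logits):
    """Greedy CTC decode: argmax per frame, collapse repeats, drop blanks."""
    ids = logits.argmax(dim=-1)             # (batch, frames)
    targets = []
    for seq in ids:
        seq = torch.unique_consecutive(seq)
        targets.append(seq[seq != blank_id])
    return targets

for (x_lab, y_lab), x_unlab in zip(labeled_loader, unlabeled_loader):
    # Supervised CTC loss on human-labeled L2 speech.
    loss = student(input_values=x_lab, labels=y_lab).loss

    # Dynamic pseudo labels from the momentum teacher, regenerated on the fly.
    with torch.no_grad():
        pl = greedy_ctc_labels(teacher(input_values=x_unlab).logits)
    y_pl = torch.nn.utils.rnn.pad_sequence(
        pl, batch_first=True, padding_value=-100)   # -100 is ignored by the HF CTC loss
    loss = loss + student(input_values=x_unlab, labels=y_pl).loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Momentum update: the teacher tracks an exponential moving average
    # of the student, smoothing out noisy pseudo labels over time.
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(momentum).add_(ps, alpha=1.0 - momentum)
```

Because the teacher averages many recent student states rather than tracking any single one, an occasional noisy pseudo label has limited influence on the targets, which is the robustness property the abstract refers to.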
