Paper Title
BERT-LID: Leveraging BERT to Improve Spoken Language Identification
Paper Authors
Paper Abstract
Language identification is the task of automatically determining the identity of the language conveyed by a spoken segment. It has a profound impact on the multilingual interoperability of intelligent speech systems. Although language identification attains high accuracy on medium and long utterances (>3s), performance on short utterances (<=1s) is still far from satisfactory. We propose a BERT-based language identification system (BERT-LID) to improve language identification performance, especially on short-duration speech segments. We extend the original BERT model by taking phonetic posteriorgrams (PPGs) derived from a front-end phone recognizer as input, and then deploy an optimal deep classifier on top of it for language identification. Our BERT-LID model improves the baseline accuracy by about 6.5% on long-segment identification and 19.9% on short-segment identification, demonstrating BERT-LID's effectiveness for language identification.
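To make the pipeline in the abstract concrete — frame-level PPGs from a front-end phone recognizer encoded by a BERT-style bidirectional Transformer, with a deep classifier predicting the language — here is a minimal PyTorch sketch. All dimensions (PPG size, hidden size, layer counts, number of languages), the learnable [CLS]-style pooling token, and the classifier layout are illustrative assumptions, not the paper's actual configuration; positional encoding and the phone recognizer itself are omitted.

```python
# Minimal sketch of a BERT-LID-style model (assumed dimensions, not the paper's).
import torch
import torch.nn as nn

class BertLidSketch(nn.Module):
    def __init__(self, ppg_dim=144, hidden_dim=256, num_layers=4,
                 num_heads=4, num_languages=2):
        super().__init__()
        # Project frame-level phonetic posteriorgrams (PPGs) into the
        # encoder's hidden space; prepend a learnable [CLS]-style token.
        self.input_proj = nn.Linear(ppg_dim, hidden_dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden_dim))
        # BERT-like bidirectional Transformer encoder over the PPG sequence
        # (positional encoding omitted for brevity).
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Deep classifier head mapping the pooled representation to languages.
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_languages))

    def forward(self, ppg):               # ppg: (batch, frames, ppg_dim)
        x = self.input_proj(ppg)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1)    # prepend [CLS] token
        x = self.encoder(x)
        return self.classifier(x[:, 0])   # language logits from [CLS] position

# Example: classify 1-second segments (~100 PPG frames) between two languages.
model = BertLidSketch()
logits = model(torch.randn(8, 100, 144))  # (batch=8, frames=100, ppg_dim=144)
print(logits.shape)                        # torch.Size([8, 2])
```

The sketch only illustrates why a BERT-style encoder can help on short segments: the self-attention layers aggregate evidence across all PPG frames at once, so even a ~1-second input yields a single pooled representation for the classifier.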