Paper Title
BERT-LID: Leveraging BERT to Improve Spoken Language Identification
Paper Authors
Paper Abstract
Language identification is the task of automatically determining the identity of the language conveyed by a spoken segment. It has a profound impact on the multilingual interoperability of intelligent speech systems. Although language identification attains high accuracy on medium and long utterances (>3s), performance on short utterances (<=1s) is still far from satisfactory. We propose a BERT-based language identification system (BERT-LID) to improve language identification performance, especially on short-duration speech segments. We extend the original BERT model by taking phonetic posteriorgrams (PPGs) derived from a front-end phone recognizer as input, and then deploy an optimal deep classifier on top of it for language identification. Our BERT-LID model improves the baseline accuracy by about 6.5% on long-segment identification and 19.9% on short-segment identification, demonstrating BERT-LID's effectiveness for language identification.
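To make the pipeline in the abstract concrete — frame-level PPGs from a front-end phone recognizer encoded by a BERT-style bidirectional Transformer, with a deep classifier predicting the language — here is a minimal PyTorch sketch. All dimensions (PPG size, hidden size, layer counts, number of languages), the learnable [CLS]-style pooling token, and the classifier layout are illustrative assumptions, not the paper's actual configuration; positional encoding and the phone recognizer itself are omitted.

```python
# Minimal sketch of a BERT-LID-style model (assumed dimensions, not the paper's).
import torch
import torch.nn as nn

class BertLidSketch(nn.Module):
    def __init__(self, ppg_dim=144, hidden_dim=256, num_layers=4,
                 num_heads=4, num_languages=2):
        super().__init__()
        # Project frame-level phonetic posteriorgrams (PPGs) into the
        # encoder's hidden space; prepend a learnable [CLS]-style token.
        self.input_proj = nn.Linear(ppg_dim, hidden_dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden_dim))
        # BERT-like bidirectional Transformer encoder over the PPG sequence
        # (positional encoding omitted for brevity).
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Deep classifier head mapping the pooled representation to languages.
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_languages))

    def forward(self, ppg):               # ppg: (batch, frames, ppg_dim)
        x = self.input_proj(ppg)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1)    # prepend [CLS] token
        x = self.encoder(x)
        return self.classifier(x[:, 0])   # language logits from [CLS] position

# Example: classify 1-second segments (~100 PPG frames) between two languages.
model = BertLidSketch()
logits = model(torch.randn(8, 100, 144))  # (batch=8, frames=100, ppg_dim=144)
print(logits.shape)                        # torch.Size([8, 2])
```

The sketch only illustrates why a BERT-style encoder can help on short segments: the self-attention layers aggregate evidence across all PPG frames at once, so even a ~1-second input yields a single pooled representation for the classifier.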