Paper Title

Analyzing Robustness of End-to-End Neural Models for Automatic Speech Recognition

Authors

Goutham Rajendran, Wei Zou

Abstract

We investigate robustness properties of pre-trained neural models for automatic speech recognition. Real-life data in machine learning is usually very noisy and almost never clean, which can be attributed to various factors depending on the domain, e.g. outliers, random noise and adversarial noise. Therefore, the models we develop for various tasks should be robust to such kinds of noisy data, which has led to the thriving field of robust machine learning. We consider this important issue in the setting of automatic speech recognition. With the increasing popularity of pre-trained models, analyzing and understanding the robustness of such models to noise is an important question. In this work, we perform a robustness analysis of the pre-trained neural models wav2vec2, HuBERT and DistilHuBERT on the LibriSpeech and TIMIT datasets. We use different kinds of noising mechanisms and measure model performance as quantified by inference time and the standard Word Error Rate metric. We also perform an in-depth layer-wise analysis of the wav2vec2 model when injecting noise in between layers, enabling us to predict at a high level what each layer learns. Finally, for this model, we visualize the propagation of errors across the layers and compare how it behaves on clean versus noisy data. Our experiments confirm the predictions of Pasad et al. [2021] and also raise interesting directions for future work.
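The abstract does not give the paper's exact evaluation pipeline, but the two building blocks it names can be sketched simply. The following is a minimal illustration, assuming additive Gaussian noise scaled to a target signal-to-noise ratio and the standard word-level Levenshtein definition of Word Error Rate; the function names and SNR parameterization here are illustrative, not taken from the paper.

```python
import numpy as np

def add_gaussian_noise(waveform, snr_db, seed=0):
    """Add Gaussian noise scaled so the result has roughly the target SNR (dB)."""
    signal_power = np.mean(waveform ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, np.sqrt(noise_power), waveform.shape)
    return waveform + noise

def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance (sub/ins/del) divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table for the Levenshtein distance over words.
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)
    d[0, :] = np.arange(len(hyp) + 1)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,        # deletion
                          d[i, j - 1] + 1,        # insertion
                          d[i - 1, j - 1] + cost) # substitution
    return d[len(ref), len(hyp)] / len(ref)
```

In a pipeline like the one described, the noised waveform would be fed to the pre-trained model (wav2vec2, HuBERT or DistilHuBERT) and `word_error_rate` applied to its transcript against the ground-truth text, sweeping `snr_db` to trace robustness.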
