Paper Title
Probing Statistical Representations For End-To-End ASR
Authors
Abstract
End-to-end automatic speech recognition (ASR) models aim to learn a generalised speech representation to perform recognition. Little research in this domain analyses internal representation dependencies and their relationship to modelling approaches. This paper investigates cross-domain language model dependencies within transformer architectures using singular vector canonical correlation analysis (SVCCA) and uses these insights to exploit modelling approaches. Specific neural representations within the transformer layers were found to exhibit correlated behaviour that impacts recognition performance. Altogether, this work provides an analysis of the modelling approaches affecting contextual dependencies and ASR performance, and it can be used to create or adapt better-performing end-to-end ASR models and to support downstream tasks.
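For readers unfamiliar with SVCCA, the following is a minimal NumPy sketch of how a similarity score between two sets of layer activations could be computed. It is not the authors' implementation: the function names `top_subspace` and `svcca_similarity`, the 0.99 variance threshold, and the random matrices standing in for transformer activations are all illustrative assumptions.

```python
import numpy as np

def top_subspace(acts, var_kept=0.99):
    """Orthonormal basis of the leading directions of `acts`.

    acts: array of shape (n_samples, n_neurons).
    var_kept: fraction of variance to retain (assumed value, not from the paper).
    """
    centered = acts - acts.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest number of components whose cumulative variance reaches var_kept.
    k = min(int(np.searchsorted(energy, var_kept)) + 1, len(s))
    return u[:, :k]

def svcca_similarity(x, y, var_kept=0.99):
    """Mean canonical correlation between the SVD-reduced activations x and y."""
    ux = top_subspace(x, var_kept)
    uy = top_subspace(y, var_kept)
    # For orthonormal bases, the canonical correlations are the singular
    # values of ux^T uy.
    rho = np.linalg.svd(ux.T @ uy, compute_uv=False)
    return float(np.clip(rho, 0.0, 1.0).mean())

# Hypothetical stand-ins for activations from two layers or two models.
rng = np.random.default_rng(0)
layer_a = rng.standard_normal((500, 16))
rotation, _ = np.linalg.qr(rng.standard_normal((16, 16)))
layer_b = layer_a @ rotation               # same information, rotated basis
layer_c = rng.standard_normal((500, 16))   # unrelated activations

sim_rotated = svcca_similarity(layer_a, layer_b)
sim_unrelated = svcca_similarity(layer_a, layer_c)
```

In this sketch, a rotated copy of the same activations should score close to 1, while unrelated activations should score well below that. A real analysis would replace the random matrices with activations collected from the layers being compared over the same input frames.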