Paper Title
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation
Paper Authors
Paper Abstract
Speech distortion is a long-standing problem that degrades the performance of speech processing models trained with supervision. Robustness to distortion should improve so that models perform well on distorted speech without losing their original performance on clean speech. In this work, we propose to improve the robustness of speech processing models through domain adversarial training (DAT). We conducted experiments on five different speech processing tasks using the SUPERB framework. Because the distortion types present in speech data are not always known, we analyzed both binary-domain and multi-domain settings: the former treats all distorted speech as one domain, and the latter treats each distortion as a separate domain. Compared with supervised training methods, we obtained promising results in target domains where speech is corrupted by various distortions, including unseen distortions introduced only at test time.
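The difference between the binary-domain and multi-domain settings can be illustrated with a minimal sketch of how domain labels might be assigned for a domain classifier. This is not the authors' implementation: the function name `build_domain_labels`, the distortion names, and the convention that clean speech is domain 0 are all assumptions made for illustration.

```python
from typing import Dict, List

CLEAN = "clean"  # assumed name for the undistorted (source) domain


def build_domain_labels(distortions: List[str], setting: str) -> Dict[str, int]:
    """Map each distortion name to an integer domain label.

    Clean speech is assigned domain 0 (an assumption of this sketch).
    In the "binary" setting every distortion shares domain 1.
    In the "multi" setting each distortion gets its own domain index.
    """
    if setting not in ("binary", "multi"):
        raise ValueError(f"unknown setting: {setting!r}")
    labels = {CLEAN: 0}
    for index, name in enumerate(distortions, start=1):
        labels[name] = 1 if setting == "binary" else index
    return labels


# Hypothetical distortion names, chosen only for illustration.
distortions = ["noise", "reverb", "clipping"]
binary_labels = build_domain_labels(distortions, "binary")
multi_labels = build_domain_labels(distortions, "multi")
```

Under these assumptions, the binary setting needs no knowledge of which distortion was applied, only whether the speech is distorted, while the multi-domain setting requires a distortion type for every training utterance.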