Paper Title

RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

Paper Authors

Efthymios Tzinis, Yossi Adi, Vamsi Krishna Ithapu, Buye Xu, Paris Smaragdis, Anurag Kumar

Paper Abstract

We present RemixIT, a simple yet effective self-supervised method for training speech enhancement without requiring a single isolated in-domain speech or noise waveform. Our approach overcomes the limitations of previous methods, which depend on clean in-domain target signals and are therefore sensitive to any domain mismatch between train and test samples. RemixIT is based on a continuous self-training scheme in which a teacher model, pre-trained on out-of-domain data, infers estimated pseudo-target signals for in-domain mixtures. Then, by permuting the estimated clean and noise signals and remixing them together, we generate a new set of bootstrapped mixtures and corresponding pseudo-targets which are used to train the student network. In turn, the teacher periodically refines its estimates using the updated parameters of the latest student model. Experimental results on multiple speech enhancement datasets and tasks not only show the superiority of our method over prior approaches but also showcase that RemixIT can be combined with any separation model and applied to any semi-supervised or unsupervised domain adaptation task. Our analysis, paired with empirical evidence, sheds light on the inner workings of our self-training scheme, wherein the student model keeps obtaining better performance while observing severely degraded pseudo-targets.
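For intuition, the bootstrapped remixing step described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the authors' implementation: the `teacher`/`student` interface (a model returning estimated speech and noise waveforms), the batch-wise `randperm` permutation, and the MSE loss are all assumptions made for the sketch; the paper itself may use different separation models and loss functions.

```python
import torch

def remixit_step(teacher, student, in_domain_mix, optimizer):
    """One illustrative RemixIT-style training step (sketch only).

    in_domain_mix: tensor of shape (batch, time) with noisy
    in-domain waveforms. Both models are assumed to map a mixture
    to a (speech, noise) pair of waveforms of the same shape.
    """
    # 1) The teacher infers pseudo-targets for the in-domain mixtures.
    with torch.no_grad():
        est_speech, est_noise = teacher(in_domain_mix)

    # 2) Permute the estimated noises across the batch and remix them
    #    with the estimated speech to form bootstrapped mixtures.
    perm = torch.randperm(est_noise.shape[0])
    bootstrapped_mix = est_speech + est_noise[perm]

    # 3) Train the student to recover the pseudo-targets from the
    #    bootstrapped mixtures (MSE here is an assumed loss choice).
    student_speech, student_noise = student(bootstrapped_mix)
    loss = ((student_speech - est_speech) ** 2).mean() \
         + ((student_noise - est_noise[perm]) ** 2).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Per the abstract's continual self-training scheme, the teacher would then be refreshed periodically from the latest student, e.g. by copying or exponentially averaging the student's parameters into the teacher (`teacher.load_state_dict(student.state_dict())` in the simplest case).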
