Paper title
An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions
Paper authors
Paper abstract
This paper presents a new voice conversion (VC) framework capable of dealing with both additive noise and reverberation, together with an evaluation of its performance. Several VC studies have focused on real-world conditions in which speech data are corrupted by background noise and reverberation. To deal with more practical conditions where no clean target dataset is available, one possible approach is zero-shot VC, but its performance tends to degrade compared with VC using a sufficient amount of target speech data. To leverage a large amount of noisy-reverberant target speech data, we propose a three-stage VC framework consisting of a denoising process using a pretrained denoising model, a dereverberation process using a dereverberation model, and a VC process using a nonparallel VC model based on a variational autoencoder. The experimental results show that 1) noise and reverberation additively cause significant VC performance degradation, 2) the proposed method alleviates the adverse effects caused by both noise and reverberation and significantly outperforms the baseline trained directly on the noisy-reverberant speech data, and 3) the potential degradation introduced by denoising and dereverberation still causes noticeable adverse effects on VC performance.