Time Domain Adversarial Voice Conversion for ADD 2022

Paper title

Time Domain Adversarial Voice Conversion for ADD 2022

Authors

Wen, Cheng; Guo, Tingwei; Tan, Xingjun; Yan, Rui; Zhou, Shuran; Xie, Chuandong; Zou, Wei; Li, Xiangang

Abstract

In this paper, we describe our speech generation system for the first Audio Deep Synthesis Detection Challenge (ADD 2022). Firstly, we build an any-to-many voice conversion (VC) system to convert source speech with arbitrary language content into the target speaker's fake speech. Then the converted speech generated from VC is post-processed in the time domain to improve the deception ability. The experimental results show that our system has adversarial ability against anti-spoofing detectors with a little compromise in audio quality and speaker similarity. This system ranks top in Track 3.1 in the ADD 2022, showing that our method could also gain good generalization ability against different detectors.
