Paper title
The VoiceMOS Challenge 2022
Paper authors
Paper abstract
We present the first edition of the VoiceMOS Challenge, a scientific event that aims to promote the study of automatic prediction of the mean opinion score (MOS) of synthetic speech. This challenge drew 22 participating teams from academia and industry who tried a variety of approaches to tackle the problem of predicting human ratings of synthesized speech. The listening test data for the main track of the challenge consisted of samples from 187 different text-to-speech and voice conversion systems spanning over a decade of research, and the out-of-domain track consisted of data from more recent systems rated in a separate listening test. Results of the challenge show the effectiveness of fine-tuning self-supervised speech models for the MOS prediction task, as well as the difficulty of predicting MOS ratings for unseen speakers and listeners, and for unseen systems in the out-of-domain setting.