Paper title
Bayesian Subspace HMM for the Zerospeech 2020 Challenge
Paper authors
Paper abstract
In this paper, we describe our submission to the Zerospeech 2020 challenge, in which participants must discover latent representations from unannotated speech and use those representations to perform speech synthesis, with synthesis quality serving as a proxy metric for unit quality. Our system uses the Bayesian Subspace Hidden Markov Model (SHMM) for unit discovery. The SHMM models each unit as an HMM whose parameters are constrained to lie in a low-dimensional subspace of the total parameter space; this subspace is trained to model phonetic variability. Our system compares favorably with the baseline on the human-evaluated character error rate while maintaining a significantly lower unit bitrate.