Paper Title

Rapid Connectionist Speaker Adaptation

Authors

Michael Witbrock, Patrick Haffner

Abstract

We present SVCnet, a system for modelling speaker variability. Encoder Neural Networks specialized for each speech sound produce low-dimensionality models of acoustical variation, and these models are further combined into an overall model of voice variability. A training procedure is described which minimizes the dependence of this model on which sounds have been uttered. Using the trained model (SVCnet) and a brief, unconstrained sample of a new speaker's voice, the system produces a Speaker Voice Code that can be used to adapt a recognition system to the new speaker without retraining. A system which combines SVCnet with an MS-TDNN recognizer is described.
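
The data flow the abstract describes can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the linear encoders, the plain averaging used to combine per-sound codes, the appending of the code to each input frame, and every name and dimension here (`CODE_DIM`, `FRAME_DIM`, `make_encoder`, `speaker_voice_code`, `adapt_input`) are assumptions made for the example. The abstract does not say how the per-sound models are combined or how the recognizer consumes the code.

```python
import random

CODE_DIM = 2   # assumed size of each low-dimensional per-sound code
FRAME_DIM = 4  # assumed size of one acoustic feature frame


def make_encoder(rng):
    """Random linear map FRAME_DIM -> CODE_DIM, a stand-in for a trained encoder."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(FRAME_DIM)]
            for _ in range(CODE_DIM)]


def encode(encoder, frame):
    """Project one acoustic frame to a low-dimensional code."""
    return [sum(w * x for w, x in zip(row, frame)) for row in encoder]


def speaker_voice_code(encoders, labelled_frames):
    """Average the per-sound codes over whichever sounds were uttered.

    labelled_frames is a list of (sound_label, frame) pairs.
    Averaging is an assumption; the paper trains the combination.
    """
    if not labelled_frames:
        raise ValueError("need at least one labelled frame")
    total = [0.0] * CODE_DIM
    for sound, frame in labelled_frames:
        code = encode(encoders[sound], frame)
        total = [t + c for t, c in zip(total, code)]
    return [t / len(labelled_frames) for t in total]


def adapt_input(frame, voice_code):
    """Condition a recognizer on the speaker by appending the code to a frame
    (assumed mechanism; no recognizer weights are retrained)."""
    return list(frame) + list(voice_code)


rng = random.Random(0)
encoders = {sound: make_encoder(rng) for sound in ("a", "i", "s")}
sample = [("a", [0.1, 0.2, 0.3, 0.4]), ("s", [0.5, 0.1, 0.0, 0.2])]
svc = speaker_voice_code(encoders, sample)
augmented = adapt_input([0.3, 0.3, 0.3, 0.3], svc)
print(len(svc), len(augmented))  # 2 6
```

The sample deliberately omits the sound "i" to mirror the "brief, unconstrained sample" in the abstract: the code is computed from whatever sounds happen to be present.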
