Paper Title
The Deterministic plus Stochastic Model of the Residual Signal and its Applications
Paper Authors
Paper Abstract
The modeling of speech production often relies on a source-filter approach. Although methods for parameterizing the filter have reached a certain maturity, several speech processing applications still stand to gain considerably from an appropriate excitation model. This manuscript presents a Deterministic plus Stochastic Model (DSM) of the residual signal. The DSM consists of two contributions acting in two distinct spectral bands delimited by a maximum voiced frequency. Both components are extracted from an analysis performed on a speaker-dependent dataset of pitch-synchronous residual frames. The deterministic part models the low-frequency content and arises from an orthonormal decomposition of these frames. The stochastic component is a high-frequency noise modulated in both time and frequency. Some interesting phonetic and computational properties of the DSM are also highlighted. The applicability of the DSM in two fields of speech processing is then studied. First, it is shown that incorporating the DSM vocoder in HMM-based speech synthesis enhances the delivered quality. The proposed approach turns out to significantly outperform the traditional pulse excitation and provides a quality equivalent to STRAIGHT. In a second application, the potential of glottal signatures derived from the proposed DSM is investigated for speaker identification. Interestingly, these signatures are shown to lead to better recognition rates than other glottal-based methods.
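The two-band structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the synthetic frames, the sampling rate `fs`, the maximum voiced frequency `fm`, the ideal FFT band mask, the mean-centering before the decomposition, and the triangular time envelope are all assumptions made for illustration. Only the time modulation of the noise is shown; frequency shaping is reduced to the high-pass mask.

```python
import numpy as np

def first_eigen_residual(frames):
    """Leading orthonormal basis vector of length-normalised residual frames (one per row)."""
    centered = frames - frames.mean(axis=0)  # assumption: centre before decomposing
    # Rows of vt are orthonormal vectors ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def band_split(x, fs, fm, keep):
    """Keep the band at or below fm ('low') or above fm ('high') with an ideal FFT mask."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mask = freqs <= fm if keep == "low" else freqs > fm
    return np.fft.irfft(spec * mask, n=len(x))

def dsm_frame(deterministic, fs, fm, rng):
    """One excitation frame: low-band deterministic part plus time-modulated high-band noise."""
    n = len(deterministic)
    low = band_split(deterministic, fs, fm, "low")
    noise = band_split(rng.standard_normal(n), fs, fm, "high")
    envelope = np.bartlett(n)  # assumption: triangular time envelope
    return low + envelope * noise

# Hypothetical data: 200 synthetic "residual frames" sharing one underlying waveform.
fs, fm, n = 16000, 4000.0, 64
rng = np.random.default_rng(0)
base = np.hanning(n) * np.sin(2 * np.pi * 500 * np.arange(n) / fs)
gains = rng.uniform(0.5, 1.5, size=200)
frames = gains[:, None] * base + 0.01 * rng.standard_normal((200, n))

eigen = first_eigen_residual(frames)
excitation = dsm_frame(eigen, fs, fm, rng)
```

In a real system the frames would come from inverse filtering of recorded speech and would be pitch-synchronous and length-normalised before the decomposition, and the maximum voiced frequency would be chosen from the data rather than fixed by hand.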