Paper Title

Tensor-to-Vector Regression for Multi-channel Speech Enhancement based on Tensor-Train Network

Authors

Jun Qi, Hu Hu, Yannan Wang, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Abstract

We propose a tensor-to-vector regression approach to multi-channel speech enhancement in order to address the issues of input size explosion and hidden-layer size expansion. The key idea is to cast the conventional deep neural network (DNN) based vector-to-vector regression formulation under a tensor-train network (TTN) framework. TTN is a recently emerged solution for the compact representation of deep models with fully connected hidden layers. TTN thus maintains the expressive power of a DNN yet involves a much smaller number of trainable parameters. Furthermore, TTN can handle a multi-dimensional tensor input by design, which exactly matches the desired setting in multi-channel speech enhancement. We first provide a theoretical extension from DNN to TTN based regression. Next, we show that TTN can attain speech enhancement quality comparable with that of DNN but with far fewer parameters; for example, a reduction from 27 million to only 5 million parameters is observed in a single-channel scenario. TTN also improves PESQ over DNN from 2.86 to 2.96 when the number of trainable parameters is slightly increased. Finally, in 8-channel conditions, a PESQ of 3.12 is achieved using 20 million parameters for TTN, whereas a DNN with 68 million parameters can only attain a PESQ of 3.06. Our implementation is available online at https://github.com/uwjunqi/Tensor-Train-Neural-Network.
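To make the parameter saving concrete, here is a minimal NumPy sketch of one fully connected layer stored in tensor-train form. It is not the authors' implementation. The mode sizes, TT ranks, and function names (`tt_param_count`, `make_tt_cores`, `tt_to_dense`, `tt_forward`) are illustrative assumptions that do not come from the paper. The sketch factorizes the weight matrix into small four-way cores and applies the layer directly to a tensor-shaped input without ever forming the dense matrix.

```python
import numpy as np


def tt_param_count(in_modes, out_modes, ranks):
    """Number of parameters held by the TT cores (biases not counted)."""
    return sum(ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
               for k in range(len(in_modes)))


def make_tt_cores(in_modes, out_modes, ranks, rng):
    """Random cores; core k has shape (r_k, m_k, n_k, r_{k+1})."""
    return [0.1 * rng.standard_normal(
                (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]))
            for k in range(len(in_modes))]


def tt_to_dense(cores):
    """Rebuild the full (prod(in_modes), prod(out_modes)) matrix.

    Only used here to check the TT forward pass; a real TT layer
    would never materialize this matrix.
    """
    full = cores[0]  # shape (1, m_1, n_1, r_1)
    for core in cores[1:]:
        # Merge the next core: rows grow by m_k, columns grow by n_k.
        full = np.einsum('aijb,bklc->aikjlc', full, core)
        a, i, k, j, l, c = full.shape
        full = full.reshape(a, i * k, j * l, c)
    return full[0, :, :, 0]


def tt_forward(x, cores):
    """Apply the TT layer to an input tensor x of shape in_modes.

    Returns a flat vector of length prod(out_modes).
    """
    # state axes: (output modes done so far, TT rank, input modes left)
    state = x.reshape(1, 1, -1)
    for core in cores:
        _, m, n, r_out = core.shape
        done, r_in, rest = state.shape
        state = state.reshape(done, r_in, m, rest // m)
        # Contract the current input mode and the incoming TT rank.
        state = np.einsum('jrmi,rmns->jnsi', state, core)
        state = state.reshape(done * n, r_out, rest // m)
    return state.reshape(-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    in_modes, out_modes = (4, 8, 8), (4, 8, 8)  # assumed: 256 -> 256 units
    ranks = (1, 4, 4, 1)                        # assumed TT ranks
    cores = make_tt_cores(in_modes, out_modes, ranks, rng)

    x = rng.standard_normal(in_modes)           # tensor-shaped input
    y_tt = tt_forward(x, cores)
    y_dense = x.reshape(-1) @ tt_to_dense(cores)

    print("TT parameters:   ", tt_param_count(in_modes, out_modes, ranks))
    print("dense parameters:", int(np.prod(in_modes) * np.prod(out_modes)))
    print("outputs match:   ", np.allclose(y_tt, y_dense))
```

With these assumed sizes the cores hold 1,344 weights, against 65,536 for the equivalent dense matrix. The input is consumed as a three-way tensor rather than a flattened vector, which is the property the abstract highlights for multi-channel input. Larger TT ranks trade some of that compression for expressive power.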
