Speech Denoising in the Waveform Domain with Self-Attention

Paper Title

Speech Denoising in the Waveform Domain with Self-Attention

Authors

Zhifeng Kong, Wei Ping, Ambrish Dantrey, Bryan Catanzaro

Abstract

In this work, we present CleanUNet, a causal speech denoising model on the raw waveform. The proposed model is based on an encoder-decoder architecture combined with several self-attention blocks to refine its bottleneck representations, which is crucial to obtain good results. The model is optimized through a set of losses defined over both waveform and multi-resolution spectrograms. The proposed method outperforms the state-of-the-art models in terms of denoised speech quality from various objective and subjective evaluation metrics. We release our code and models at https://github.com/nvidia/cleanunet.
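To make the idea of a loss "defined over both waveform and multi-resolution spectrograms" concrete, here is a minimal sketch using only the Python standard library. The abstract does not name the individual loss terms, so everything specific here is an assumption made for illustration: the L1 waveform term, the spectral-convergence and log-magnitude terms, the equal weighting, the three (frame length, hop) resolutions, and all function names. A naive DFT stands in for the optimized STFT a real training pipeline would use.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Magnitude of a Hann-windowed STFT, computed with a naive DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Keep only the non-negative frequency bins of a real-valued signal.
        bins = [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len)))
                for k in range(frame_len // 2 + 1)]
        frames.append(bins)
    return frames


def waveform_l1(clean, denoised):
    """Mean absolute error between the two waveforms (assumed waveform term)."""
    return sum(abs(c - d) for c, d in zip(clean, denoised)) / len(clean)


def spectral_loss(clean, denoised, frame_len, hop, eps=1e-8):
    """Spectral convergence plus log-magnitude L1 at one STFT resolution."""
    flat_c = [v for f in stft_magnitude(clean, frame_len, hop) for v in f]
    flat_d = [v for f in stft_magnitude(denoised, frame_len, hop) for v in f]
    # Spectral convergence: norm of the magnitude difference over norm of the target.
    diff_norm = math.sqrt(sum((c - d) ** 2 for c, d in zip(flat_c, flat_d)))
    clean_norm = math.sqrt(sum(c ** 2 for c in flat_c)) + eps
    # Log-magnitude term: mean absolute difference of log magnitudes.
    log_mag = sum(abs(math.log(c + eps) - math.log(d + eps))
                  for c, d in zip(flat_c, flat_d)) / len(flat_c)
    return diff_norm / clean_norm + log_mag


def combined_loss(clean, denoised, resolutions=((16, 4), (32, 8), (64, 16))):
    """Waveform term plus the spectral term averaged over several resolutions."""
    spec = sum(spectral_loss(clean, denoised, frame_len, hop)
               for frame_len, hop in resolutions) / len(resolutions)
    return waveform_l1(clean, denoised) + spec


# Toy example: a clean sine wave and a copy with a higher-frequency disturbance.
clean = [math.sin(2 * math.pi * 5 * n / 128) for n in range(128)]
noisy = [c + 0.1 * math.sin(2 * math.pi * 37 * n / 128)
         for n, c in enumerate(clean)]

print("loss(clean, clean):", combined_loss(clean, clean))
print("loss(clean, noisy):", combined_loss(clean, noisy))
```

Identical inputs give a loss of exactly zero, and any deviation in either the waveform or the spectrograms raises it. Averaging over several frame lengths trades time resolution against frequency resolution, which is the usual motivation for a multi-resolution spectral term.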

Paper Title