Paper Title

Evolving Multi-Resolution Pooling CNN for Monaural Singing Voice Separation

Authors

Weitao Yuan, Bofei Dong, Shengbei Wang, Masashi Unoki, Wenwu Wang

Abstract

Monaural singing voice separation (MSVS) is a challenging task that has been studied for decades. Deep neural networks (DNNs) are the current state-of-the-art methods for MSVS. However, existing DNNs are often designed manually, which is time-consuming and error-prone. In addition, the network architectures are usually pre-defined and not adapted to the training data. To address these issues, we introduce a Neural Architecture Search (NAS) method to the structure design of DNNs for MSVS. Specifically, we propose a new multi-resolution Convolutional Neural Network (CNN) framework for MSVS, namely Multi-Resolution Pooling CNN (MRP-CNN), which uses pooling operators of various sizes to extract multi-resolution features. Based on NAS, we then develop an evolving framework, namely Evolving MRP-CNN (E-MRP-CNN), which automatically searches for effective MRP-CNN structures using genetic algorithms. The search is optimized either for a single objective, considering only separation performance, or for multiple objectives, considering both separation performance and model complexity. The multi-objective E-MRP-CNN gives a set of Pareto-optimal solutions, each providing a trade-off between separation performance and model complexity. Quantitative and qualitative evaluations on the MIR-1K and DSD100 datasets demonstrate the advantages of the proposed framework over several recent baselines.
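
To make the idea of a Pareto-optimal set concrete, the following is a minimal sketch of non-dominated filtering over two objectives: a separation score to maximize and a parameter count to minimize. It is not the authors' implementation. The `Candidate` type, the helper names, and every number below are hypothetical and serve only to illustrate the trade-off described in the abstract.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical candidate architecture; fields are illustrative only."""
    name: str
    separation_score: float  # higher is better (an SDR-like metric, assumed)
    num_params: int          # lower is better (proxy for model complexity)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives
    and strictly better on at least one."""
    no_worse = (a.separation_score >= b.separation_score
                and a.num_params <= b.num_params)
    strictly_better = (a.separation_score > b.separation_score
                       or a.num_params < b.num_params)
    return no_worse and strictly_better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in population
            if not any(dominates(other, c) for other in population)]


# Made-up numbers for illustration; these are not results from the paper.
population = [
    Candidate("net_a", 9.0, 500_000),
    Candidate("net_b", 10.0, 900_000),
    Candidate("net_c", 8.5, 700_000),     # worse than net_a on both objectives
    Candidate("net_d", 10.0, 1_200_000),  # same score as net_b, more parameters
]
front = pareto_front(population)
front_names = sorted(c.name for c in front)
```

In this toy population, `net_a` and `net_b` remain on the front: neither is better than the other on both objectives, so choosing between them is a trade-off between separation quality and model size.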
