Mixing-Specific Data Augmentation Techniques for Improved Blind Violin/Piano Source Separation

Title

Mixing-Specific Data Augmentation Techniques for Improved Blind Violin/Piano Source Separation

Authors

Chiu, Ching-Yu; Hsiao, Wen-Yi; Yeh, Yin-Cheng; Yang, Yi-Hsuan; Su, Alvin Wen-Yu

Abstract

Blind music source separation has been a popular and active subject of research in both the music information retrieval and signal processing communities. To counter the lack of available multi-track data for supervised model training, a data augmentation method that creates artificial mixtures by combining tracks from different songs has been shown to be useful in recent works. Along this line, in this paper we further examine extended data augmentation methods that consider the more sophisticated mixing settings employed in the modern music production routine, the relationship between the tracks to be combined, and factors of silence. As a case study, we consider the separation of violin and piano tracks in a violin-piano ensemble, evaluating the performance in terms of common metrics, namely SDR, SIR, and SAR. In addition to examining the effectiveness of these new data augmentation methods, we also study the influence of the amount of training data. Our evaluation shows that the proposed mixing-specific data augmentation methods can help improve the performance of a deep learning-based model for source separation, especially in the case of small training data.
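
The baseline augmentation the abstract refers to, creating artificial mixtures by combining tracks from different songs, can be illustrated roughly as follows. This is only a minimal sketch and not the authors' implementation: the function name `random_mix`, the gain range, and the RMS silence threshold are all hypothetical choices made for illustration, and the more sophisticated mixing settings studied in the paper are not reproduced here.

```python
import numpy as np

def random_mix(violin_clips, piano_clips, rng,
               gain_db_range=(-6.0, 6.0), silence_rms=1e-4, max_tries=100):
    """Build one artificial mixture from a randomly paired violin and piano clip.

    All parameter names and default values are illustrative assumptions.
    Returns (mixture, violin_target, piano_target) as 1-D float arrays.
    """
    for _ in range(max_tries):
        # Pick the two stems independently, so they may come from different songs.
        v = violin_clips[rng.integers(len(violin_clips))]
        p = piano_clips[rng.integers(len(piano_clips))]
        # Crude handling of silence: reject pairs in which either clip is near-silent.
        v_rms = np.sqrt(np.mean(v ** 2))
        p_rms = np.sqrt(np.mean(p ** 2))
        if v_rms > silence_rms and p_rms > silence_rms:
            break
    else:
        raise ValueError("no non-silent pair found")

    # Trim both clips to a common length.
    n = min(len(v), len(p))
    # Draw an independent random gain (in dB) for each stem and convert to linear scale.
    gains = 10.0 ** (rng.uniform(gain_db_range[0], gain_db_range[1], size=2) / 20.0)
    v = gains[0] * v[:n]
    p = gains[1] * p[:n]
    # The mixture is the sample-wise sum; the scaled stems are the training targets.
    return v + p, v, p
```

Calling `random_mix(violin_clips, piano_clips, np.random.default_rng(0))` repeatedly would yield new (mixture, target) triples for supervised training; the clips are assumed to be mono waveforms at a shared sample rate.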

Title