Paper Title
A Study of Transfer Learning in Music Source Separation
Paper Authors
Paper Abstract
Supervised deep learning methods for performing audio source separation can be very effective in domains where there is a large amount of training data. While some musical domains, such as rock and pop, have enough data suitable for training a separation system, many others do not, such as classical music, choral music, and non-Western music traditions. It is well known that transfer learning from related domains can boost the performance of deep learning systems, but it is not always clear how best to pretrain. In this work we investigate the effectiveness of data augmentation during pretraining, the impact on performance when the pretraining and downstream datasets share similar content domains, and how much of a pretrained model must be retrained on the final target task.
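The last question the abstract raises, how much of a pretrained model must be retrained on the target task, can be illustrated with a minimal sketch of partial retraining. The toy two-layer network, the synthetic data, and all names below are hypothetical illustrations, not the paper's actual models or datasets: the "pretrained" first layer is frozen and only the final layer is fit to the target domain.

```python
import numpy as np

# Hypothetical sketch of partial retraining (fine-tuning): freeze a
# "pretrained" first layer and retrain only the second layer on a new task.
rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# Stand-ins for weights learned during pretraining on a data-rich domain.
W1 = rng.normal(size=(8, 4))   # first layer: frozen during fine-tuning
W2 = rng.normal(size=(4, 1))   # second layer: retrained on the target task
W1_frozen = W1.copy()          # keep a copy to confirm W1 never changes

# Tiny synthetic target-domain regression task.
X = rng.normal(size=(64, 8))
y = relu(X @ W1) @ rng.normal(size=(4, 1))  # unknown target function

# Fine-tune only W2 with plain gradient descent; W1 stays fixed, so the
# frozen layer acts as a feature extractor for the new domain.
H = relu(X @ W1)                         # features from the frozen layer
mse_before = float(np.mean((H @ W2 - y) ** 2))
for _ in range(300):
    pred = H @ W2
    grad = 2.0 * H.T @ (pred - y) / len(X)
    W2 -= 0.02 * grad
mse_after = float(np.mean((H @ W2 - y) ** 2))
```

In this toy setup only the final layer's parameters are updated, which is the cheapest end of the spectrum the abstract describes; retraining more of the model trades compute and target-domain data for flexibility.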