Paper Title
Low-Resource Music Genre Classification with Cross-Modal Neural Model Reprogramming
Paper Authors
Paper Abstract
Transfer learning (TL) approaches have shown promising results when handling tasks with limited training data. However, fine-tuning a pre-trained neural network with target-domain data often requires considerable memory and computational resources. In this work, we introduce a novel method for leveraging pre-trained models for low-resource (music) classification based on the concept of Neural Model Reprogramming (NMR). NMR re-purposes a pre-trained model from a source domain to a target domain by modifying only the input of the frozen pre-trained model. In addition to the known input-independent reprogramming method, we propose an advanced reprogramming paradigm, input-dependent NMR, to increase adaptability to complex input data such as musical audio. Experimental results suggest that a neural model pre-trained on large-scale datasets can successfully perform music genre classification using this reprogramming method. The two proposed input-dependent NMR TL methods outperform fine-tuning-based TL methods on a small genre classification dataset.
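The core contrast described in the abstract can be sketched in a few lines: with NMR the pre-trained model's weights stay frozen, and only a trainable transformation of the input is learned. The sketch below is illustrative only, not the authors' architecture: it uses a random linear map as a stand-in for the frozen pre-trained model, a shared perturbation `delta` for input-independent NMR, and a hypothetical trainable generator matrix `U` for input-dependent NMR.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pre-trained source-domain model (illustration only):
# a fixed linear classifier with 10 source classes over 128-dim inputs.
W = rng.standard_normal((10, 128))

def frozen_model(x):
    # The model's weights W are never updated during reprogramming.
    return W @ x

# Input-independent NMR: one trainable perturbation delta, shared by every
# input. Only delta would be optimized; here it starts at zero.
delta = np.zeros(128)

def reprogram_independent(x):
    return frozen_model(x + delta)

# Input-dependent NMR (sketched as a tiny trainable mapping U, a
# hypothetical generator): the perturbation is computed from the input
# itself, which gives more flexibility for complex data such as audio.
U = np.zeros((128, 128))

def reprogram_dependent(x):
    return frozen_model(x + U @ x)

x = rng.standard_normal(128)  # a target-domain example
print(np.allclose(reprogram_independent(x), frozen_model(x)))  # True while delta is 0
```

In a real setup, `delta` (or `U`) would be trained with gradient descent on the target-domain loss, together with a mapping from source-class logits to target genre labels, while `frozen_model` remains untouched; this is why NMR needs far less memory than fine-tuning the full network.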