Paper Title

A Study on Broadcast Networks for Music Genre Classification

Authors

Ahmed Heakl, Abdelrahman Abdelgawad, Victor Parque

Abstract

Due to the increased demand for music streaming and recommender services and recent developments in music information retrieval frameworks, Music Genre Classification (MGC) has attracted the community's attention. However, convolution-based approaches are known to lack the ability to efficiently encode and localize temporal features. In this paper, we study broadcast-based neural networks with the aim of improving localization and generalizability under a small parameter budget (about 180k), and we investigate twelve variants of broadcast networks, discussing the effect of block configuration, pooling method, activation function, normalization mechanism, label smoothing, channel interdependency, LSTM block inclusion, and variants of inception schemes. Our computational experiments on relevant datasets such as GTZAN, Extended Ballroom, HOMBURG, and Free Music Archive (FMA) show state-of-the-art classification accuracies in Music Genre Classification. Our approach offers insights and the potential to enable compact and generalizable broadcast networks for music and audio classification.
