Paper Title

Modeling Musical Structure with Artificial Neural Networks

Author

Lattner, Stefan

Abstract

In recent years, artificial neural networks (ANNs) have become a universal tool for tackling real-world problems. ANNs have also shown great success in music-related tasks including music summarization and classification, similarity estimation, computer-aided or autonomous composition, and automatic music analysis. As structure is a fundamental characteristic of Western music, it plays a role in all these tasks. Some structural aspects are particularly challenging to learn with current ANN architectures. This is especially true for mid- and high-level self-similarity, tonal and rhythmic relationships. In this thesis, I explore the application of ANNs to different aspects of musical structure modeling, identify some challenges involved and propose strategies to address them. First, using probability estimations of a Restricted Boltzmann Machine (RBM), a probabilistic bottom-up approach to melody segmentation is studied. Then, a top-down method for imposing a high-level structural template in music generation is presented, which combines Gibbs sampling using a convolutional RBM with gradient-descent optimization on the intermediate solutions. Furthermore, I motivate the relevance of musical transformations in structure modeling and show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments. For learning transformations in sequences, I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals. Furthermore, the applicability of these interval representations to a top-down discovery of repeated musical sections is shown. Finally, a recurrent variant of the GAE is proposed, and its efficacy in music prediction and modeling of low-level repetition structure is demonstrated.
