Learning Style-Aware Symbolic Music Representations by Adversarial Autoencoders

Paper title

Learning Style-Aware Symbolic Music Representations by Adversarial Autoencoders

Authors

Andrea Valenti, Antonio Carta, Davide Bacciu

Abstract

We address the challenging open problem of learning an effective latent space for symbolic music data in generative music modeling. We focus on leveraging adversarial regularization as a flexible and natural means to imbue variational autoencoders with context information concerning music genre and style. Throughout the paper, we show how Gaussian mixtures that take music metadata information into account can be used as an effective prior for the autoencoder latent space, and we introduce the first Music Adversarial Autoencoder (MusAE). Empirical analysis on a large-scale benchmark shows that our model has a higher reconstruction accuracy than state-of-the-art models based on standard variational autoencoders. It is also able to create realistic interpolations between two musical sequences, smoothly changing the dynamics of the different tracks. Experiments show that the model can organise its latent space according to low-level properties of the musical pieces, and can embed into the latent variables the high-level genre information injected from the prior distribution to increase its overall performance. This allows us to perform changes to the generated pieces in a principled way.
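The abstract does not give the parameters of the prior, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes one Gaussian component per genre, with component means placed on a circle in the first two latent dimensions, and it assumes plain linear interpolation between latent vectors. The names `genre_means`, `sample_prior`, and `interpolate`, as well as the radius, standard deviation, and latent size, are hypothetical choices made for illustration. The encoder, decoder, and adversarial discriminator are left out.

```python
import math
import random


def genre_means(n_genres, latent_dim, radius=4.0):
    """Place one component mean per genre on a circle (hypothetical layout)."""
    means = []
    for k in range(n_genres):
        angle = 2.0 * math.pi * k / n_genres
        mean = [0.0] * latent_dim
        mean[0] = radius * math.cos(angle)
        mean[1] = radius * math.sin(angle)
        means.append(mean)
    return means


def sample_prior(genre, means, std=1.0, rng=random):
    """Draw one latent vector from the Gaussian component assigned to `genre`."""
    return [rng.gauss(m, std) for m in means[genre]]


def interpolate(z_a, z_b, steps):
    """Linear interpolation between two latent vectors (steps >= 2, endpoints included)."""
    path = []
    for i in range(steps):
        t = i / (steps - 1)
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return path


# Assumed sizes: 3 genres and an 8-dimensional latent space.
rng = random.Random(0)
means = genre_means(n_genres=3, latent_dim=8)
z_a = sample_prior(0, means, rng=rng)  # latent code drawn for genre 0
z_b = sample_prior(1, means, rng=rng)  # latent code drawn for genre 1
path = interpolate(z_a, z_b, steps=5)  # 5 latent points from z_a to z_b
```

In an adversarial autoencoder, samples such as `z_a` would serve as the "real" inputs to a discriminator that pushes the encoder's output distribution towards the prior, and each point on `path` would be passed to the decoder to obtain an intermediate musical sequence.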
