Vector Quantized Contrastive Predictive Coding for Template-based Music Generation

Paper title

Vector Quantized Contrastive Predictive Coding for Template-based Music Generation

Authors

Gaëtan Hadjeres, Léopold Crestel

Abstract

In this work, we propose a flexible method for generating variations of discrete sequences in which tokens can be grouped into basic units, like sentences in a text or bars in music. More precisely, given a template sequence, we aim to produce novel sequences that share perceptible similarities with the original template without relying on any annotation; our problem of generating variations is therefore intimately linked to the problem of learning relevant high-level representations without supervision. Our contribution is twofold. First, we propose a self-supervised encoding technique, named Vector Quantized Contrastive Predictive Coding, which learns a meaningful assignment of the basic units over a discrete set of codes, together with mechanisms for controlling the information content of these learnt discrete representations. Second, we show how these compressed representations can be used to generate variations of a template sequence by using an appropriate attention pattern in the Transformer architecture. We illustrate our approach on the corpus of J.S. Bach chorales, where we discuss the musical meaning of the learnt discrete codes and show that our proposed method can generate coherent and high-quality variations of a given template.
