Paper title
Helicality: An Isomap-based Measure of Octave Equivalence in Audio Data
Paper authors
Paper abstract
Octave equivalence serves as domain knowledge in music information retrieval (MIR) systems, including the chromagram, spiral convolutional networks, and the harmonic CQT. Prior work has applied the Isomap manifold learning algorithm to unlabeled audio data to embed frequency sub-bands in 3-D space, where Euclidean distances are inversely proportional to the strength of their Pearson correlations. However, discovering octave equivalence via Isomap requires visual inspection and is not scalable. To address this problem, we define "helicality" as the goodness of fit of the 3-D Isomap embedding to a Shepard-Risset helix. Our method is unsupervised and uses a custom Frank-Wolfe algorithm to minimize a least-squares objective inside a convex hull. Numerical experiments indicate that isolated musical notes have a higher helicality than speech, followed by drum hits.
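The abstract mentions minimizing a least-squares objective inside a convex hull with a Frank-Wolfe algorithm. The sketch below illustrates only the generic technique, not the authors' custom variant: it finds the point in the convex hull of a few vertices that is closest to a target point. The function name `frank_wolfe_hull`, its parameters, and the toy data are assumptions made for illustration.

```python
def frank_wolfe_hull(vertices, target, n_iter=200):
    """Approximate the point of conv(vertices) closest to target.

    Generic Frank-Wolfe sketch (hypothetical helper, not the paper's code).
    Minimizes ||point - target||^2 over the convex hull of `vertices`.
    Returns (weights, point), where point = sum_i weights[i] * vertices[i].
    """
    n, dim = len(vertices), len(target)
    # Start at the first vertex; weights stay on the probability simplex.
    weights = [0.0] * n
    weights[0] = 1.0
    point = list(vertices[0])
    for _ in range(n_iter):
        # Gradient of the squared distance with respect to the point.
        grad = [2.0 * (point[d] - target[d]) for d in range(dim)]
        # Linear minimization oracle: vertex with the smallest
        # inner product with the gradient.
        scores = [sum(g * v[d] for d, g in enumerate(grad)) for v in vertices]
        s = min(range(n), key=lambda i: scores[i])
        direction = [vertices[s][d] - point[d] for d in range(dim)]
        denom = sum(x * x for x in direction)
        if denom == 0.0:
            break  # already sitting on the selected vertex
        # Exact line search for a quadratic objective, clipped to [0, 1].
        gamma = -sum(g * x for g, x in zip(grad, direction)) / (2.0 * denom)
        gamma = max(0.0, min(1.0, gamma))
        if gamma == 0.0:
            break  # no descent direction left inside the hull
        weights = [(1.0 - gamma) * w for w in weights]
        weights[s] += gamma
        point = [p + gamma * x for p, x in zip(point, direction)]
    return weights, point


if __name__ == "__main__":
    # Toy example: unit square in 2-D, target outside the hull.
    square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    w, p = frank_wolfe_hull(square, (2.0, 0.5))
    print(w, p)
```

Every iterate is a convex combination of the vertices, so the constraint holds without any projection step. The residual distance between the returned point and the target is one possible goodness-of-fit quantity in a setting like this, though the abstract does not state how the helicality score itself is computed.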