DOT-VAE: Disentangling One Factor at a Time

Paper Title

DOT-VAE: Disentangling One Factor at a Time

Authors

Vaishnavi Patil, Matthew Evanusa, Joseph JaJa

Abstract

As we enter an era of machine learning characterized by an overabundance of data, the unsupervised discovery, organization, and interpretation of that data becomes a critical need. One promising approach is disentanglement, which aims to learn the underlying generative latent factors of the data, called the factors of variation, and to encode them in disjoint latent representations. Recent advances have addressed this problem for synthetic datasets generated by a fixed set of independent factors of variation. Here, we propose to extend this to real-world datasets with a countable number of factors of variation. We propose a novel framework that augments the latent space of a Variational Autoencoder with a disentangled space and is trained using a Wake-Sleep-inspired two-step algorithm for unsupervised disentanglement. Our network learns to disentangle interpretable, independent factors from the data "one at a time" and to encode them in different dimensions of the disentangled latent space, while making no prior assumptions about the number of factors or their joint distribution. We demonstrate its quantitative and qualitative effectiveness by evaluating the latent representations learned on two synthetic benchmark datasets, DSprites and 3DShapes, and on a real dataset, CelebA.
