Rotated Digit Recognition by Variational Autoencoders with Fixed Output Distributions

Paper Title

Rotated Digit Recognition by Variational Autoencoders with Fixed Output Distributions

Authors

Yevick, David

Abstract

This paper demonstrates that a simple modification of the variational autoencoder (VAE) formalism enables the method to identify and classify rotated and distorted digits. The conventional objective (cost) function employed when training a VAE both quantifies the agreement between the input and output data records and ensures that the latent space representation of each input data record is statistically generated with an appropriate mean and standard deviation. After training, simulated data realizations are generated by decoding appropriate latent space points. However, standard VAEs trained on randomly rotated MNIST digits cannot reliably distinguish between different digit classes, since each rotated input data record is effectively compared to a similarly rotated output data record. In contrast, an alternative implementation in which the objective function compares the output associated with each rotated digit to a corresponding fixed, unrotated reference digit is shown here to discriminate accurately among the rotated digits in latent space, even when the dimension of the latent space is 2 or 3.
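The difference between the two objective functions described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. The binary cross-entropy reconstruction term, the function names, the `references` array holding one fixed image per digit class, and the use of class labels during training are all assumptions made for illustration.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    # KL divergence between N(mu, diag(exp(log_var))) and N(0, I),
    # summed over the latent dimensions of each record.
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)

def binary_cross_entropy(target, output, eps=1e-7):
    # Per-record reconstruction error (assumed form; the abstract does not specify it).
    output = np.clip(output, eps, 1.0 - eps)
    return -np.sum(target * np.log(output)
                   + (1.0 - target) * np.log(1.0 - output), axis=-1)

def standard_vae_loss(x_rotated, x_decoded, mu, log_var):
    # Conventional objective: the decoder output is compared to the
    # (rotated) input record itself.
    return np.mean(binary_cross_entropy(x_rotated, x_decoded)
                   + kl_to_standard_normal(mu, log_var))

def fixed_target_vae_loss(labels, references, x_decoded, mu, log_var):
    # Modified objective (hypothetical sketch): the decoder output is compared
    # to a fixed reference image selected by the digit class of the input,
    # regardless of how the input was rotated.
    targets = references[labels]  # shape: (batch, pixels)
    return np.mean(binary_cross_entropy(targets, x_decoded)
                   + kl_to_standard_normal(mu, log_var))
```

In this sketch the only change between the two objectives is the reconstruction target, so the encoder is pushed to map all rotations of a given digit toward the same output, which is one plausible reading of why the classes separate in a low-dimensional latent space.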
