Rotation-Invariant Autoencoders for Signals on Spheres

Paper title

Rotation-Invariant Autoencoders for Signals on Spheres

Authors

Suhas Lohit, Shubhendu Trivedi

Abstract

Omnidirectional images and spherical representations of 3D shapes cannot be processed with conventional 2D convolutional neural networks (CNNs), as the unwrapping leads to large distortion. Using fast implementations of spherical and $SO(3)$ convolutions, researchers have recently developed deep learning methods better suited for classifying spherical images. These newly proposed convolutional layers naturally extend the notion of convolution to functions on the unit sphere $S^2$ and the group of rotations $SO(3)$, and these layers are equivariant to 3D rotations. In this paper, we consider the problem of unsupervised learning of rotation-invariant representations for spherical images. In particular, we carefully design an autoencoder architecture consisting of $S^2$ and $SO(3)$ convolutional layers. As 3D rotations are often a nuisance factor, the latent space is constrained to be exactly invariant to these input transformations. As the rotation information is discarded in the latent space, we craft a novel rotation-invariant loss function for training the network. Extensive experiments on multiple datasets demonstrate the usefulness of the learned representations on clustering, retrieval and classification applications.
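
The abstract uses the terms "equivariant" and "invariant" without stating them formally. As a reading aid, the standard definitions can be written as follows. The symbols $f$, $L_R$, $\Phi$, $E$, $\hat{f}$ and $\mathcal{L}$ are notation assumed here for illustration and are not taken from the paper.

$$
(L_R f)(x) = f(R^{-1}x), \qquad x \in S^2,\; R \in SO(3)
$$

$$
\text{equivariance: } \Phi(L_R f) = L_R\,\Phi(f), \qquad \text{invariance: } E(L_R f) = E(f) \quad \text{for all } R \in SO(3)
$$

Because an invariant encoder $E$ discards the orientation of the input, a reconstruction $\hat{f}$ can at best match $f$ up to an unknown rotation. One generic way to write a loss that tolerates this is to minimize over rotations; this is only a sketch of the idea, and the loss actually proposed in the paper may take a different form:

$$
\mathcal{L}(f, \hat{f}) = \min_{R \in SO(3)} \left\| f - L_R \hat{f} \right\|_2^2
$$

Since the $L^2$ norm on $S^2$ is unchanged when both arguments are rotated together, this quantity stays the same when either $f$ or $\hat{f}$ is rotated on its own.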

Paper title