Paper Title

Rotation-Invariant Autoencoders for Signals on Spheres

Authors

Suhas Lohit, Shubhendu Trivedi

Abstract

Omnidirectional images and spherical representations of 3D shapes cannot be processed with conventional 2D convolutional neural networks (CNNs), as unwrapping them onto a plane leads to large distortion. Using fast implementations of spherical and $SO(3)$ convolutions, researchers have recently developed deep learning methods better suited for classifying spherical images. These newly proposed convolutional layers naturally extend the notion of convolution to functions on the unit sphere $S^2$ and the group of rotations $SO(3)$, and they are equivariant to 3D rotations. In this paper, we consider the problem of unsupervised learning of rotation-invariant representations for spherical images. In particular, we carefully design an autoencoder architecture consisting of $S^2$ and $SO(3)$ convolutional layers. As 3D rotations are often a nuisance factor, the latent space is constrained to be exactly invariant to these input transformations. Because the rotation information is discarded in the latent space, we craft a novel rotation-invariant loss function for training the network. Extensive experiments on multiple datasets demonstrate the usefulness of the learned representations for clustering, retrieval, and classification applications.
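The abstract does not spell out the loss, so the following is only a minimal sketch of the general idea, under loud assumptions: it replaces the sphere with a circle and 3D rotations with cyclic shifts, and it is not the authors' $S^2$ / $SO(3)$ implementation. The names `rotate`, `invariant_loss`, and `power_spectrum` are hypothetical. It illustrates two points: a descriptor (here, the DFT power spectrum) can be exactly invariant to rotations, and a reconstruction loss can be made rotation-invariant by minimizing the error over all rotations of the output.

```python
import math


def rotate(signal, k):
    """Cyclically shift a circular signal by k samples (a discrete planar rotation)."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


def plain_loss(x, y):
    """Ordinary squared error; sensitive to how y is rotated relative to x."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def invariant_loss(x, y):
    """Squared error minimized over every cyclic shift of y (toy rotation-invariant loss)."""
    return min(plain_loss(x, rotate(y, k)) for k in range(len(y)))


def power_spectrum(signal):
    """Squared DFT magnitudes; unchanged when the input is cyclically shifted."""
    n = len(signal)
    spectrum = []
    for f in range(n):
        re = sum(v * math.cos(2 * math.pi * f * t / n) for t, v in enumerate(signal))
        im = sum(v * math.sin(2 * math.pi * f * t / n) for t, v in enumerate(signal))
        spectrum.append(re * re + im * im)
    return spectrum


# Toy data (made up for illustration): a signal and a rotated copy of it.
x = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0]
x_rot = rotate(x, 2)

print("plain loss:    ", plain_loss(x, x_rot))      # nonzero: penalizes the rotation
print("invariant loss:", invariant_loss(x, x_rot))  # zero: some shift realigns the copy
print("spectra match: ", all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(power_spectrum(x), power_spectrum(x_rot))
))
```

The brute-force search over shifts is only practical for this toy case. On the sphere the set of rotations is continuous, which is why fast spherical and $SO(3)$ correlation routines matter in that setting.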
