Paper title
Low Dimensional Invariant Embeddings for Universal Geometric Learning
Paper authors
Paper abstract
This paper studies separating invariants: mappings on $D$-dimensional domains which are invariant to an appropriate group action and which separate orbits. The motivation for this study comes from the usefulness of separating invariants in proving universality of equivariant neural network architectures. We observe that in several cases the cardinality of the separating invariants proposed in the machine learning literature is much larger than the dimension $D$. As a result, the theoretical universal constructions based on these separating invariants are unrealistically large. Our goal in this paper is to resolve this issue. We show that when a continuous family of semi-algebraic separating invariants is available, separation can be obtained by randomly selecting $2D+1$ of these invariants. We apply this methodology to obtain an efficient scheme for computing separating invariants for several classical group actions which have been studied in the invariant learning literature. Examples include matrix multiplication actions on point clouds by permutations, rotations, and various other linear groups. Often the requirement of invariant separation is relaxed and only generic separation is required. In this case, we show that only $D+1$ invariants are required. More importantly, generic invariants are often significantly easier to compute, as we illustrate by discussing generic and full separation for weighted graphs. Finally, we outline an approach for proving that separating invariants can also be constructed when the random parameters have finite precision.
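The random-selection idea can be illustrated for the permutation action on point clouds. The sketch below assumes a sort-based family of invariants: project the $n$ points in $\mathbb{R}^d$ onto a direction $a$, sort the projections, and take an inner product with a weight vector $w$. This family, the name `make_embedding`, and the Gaussian sampling of $(a, w)$ are assumptions chosen for illustration; the abstract does not specify them. With $D = nd$, the sketch draws $2D+1$ such invariants at random.

```python
import random


def make_embedding(d, n, seed=0):
    """Draw 2D+1 random sort-based invariants for clouds of n points in R^d.

    Hypothetical helper: D = d * n is the dimension of the domain, and each
    invariant is parameterized by a direction a in R^d and weights w in R^n.
    """
    rng = random.Random(seed)
    num_invariants = 2 * d * n + 1
    params = [
        (
            [rng.gauss(0.0, 1.0) for _ in range(d)],  # projection direction a
            [rng.gauss(0.0, 1.0) for _ in range(n)],  # weights w on sorted values
        )
        for _ in range(num_invariants)
    ]

    def embed(points):
        # points: list of n points, each a list of d floats
        features = []
        for a, w in params:
            # Sorting the projections removes any dependence on point order.
            projections = sorted(
                sum(ai * xi for ai, xi in zip(a, p)) for p in points
            )
            features.append(sum(wi * si for wi, si in zip(w, projections)))
        return features

    return embed


# Usage: the embedding is unchanged when the points are reordered.
embed = make_embedding(d=2, n=3)
cloud = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
shuffled = [cloud[2], cloud[0], cloud[1]]
assert embed(cloud) == embed(shuffled)
```

The sketch only demonstrates invariance and the $2D+1$ output size. Whether this particular family separates all orbits is a claim of the kind the paper addresses and is not verified by the code.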