Paper Title

Learning Polynomial Transformations

Paper Authors

Sitan Chen, Jerry Li, Yuanzhi Li, Anru R. Zhang

Paper Abstract

We consider the problem of learning high dimensional polynomial transformations of Gaussians. Given samples of the form $p(x)$, where $x\sim N(0, \mathrm{Id}_r)$ is hidden and $p: \mathbb{R}^r \to \mathbb{R}^d$ is a function where every output coordinate is a low-degree polynomial, the goal is to learn the distribution over $p(x)$. This problem is natural in its own right, but is also an important special case of learning deep generative models, namely pushforwards of Gaussians under two-layer neural networks with polynomial activations. Understanding the learnability of such generative models is crucial to understanding why they perform so well in practice. Our first main result is a polynomial-time algorithm for learning quadratic transformations of Gaussians in a smoothed setting. Our second main result is a polynomial-time algorithm for learning constant-degree polynomial transformations of Gaussians in a smoothed setting, when the rank of the associated tensors is small. In fact, our results extend to any rotation-invariant input distribution, not just Gaussian. These are the first end-to-end guarantees for learning a pushforward under a neural network with more than one layer. Along the way, we also give the first polynomial-time algorithms with provable guarantees for tensor ring decomposition, a popular generalization of tensor decomposition that is used in practice to implicitly store large tensors.
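To make the sampling model concrete, the following is a minimal sketch (illustrative, not the paper's algorithm) of how observed samples arise under a quadratic transformation of a Gaussian: each output coordinate is a quadratic polynomial of a hidden standard Gaussian vector, and the learner only ever sees the outputs. The parameters A, b, c and the dimensions below are arbitrary, hypothetical choices.

```python
import numpy as np

# Sketch of the generative model: samples from the pushforward of
# x ~ N(0, Id_r) under a quadratic map p_i(x) = x^T A_i x + b_i^T x + c_i
# for each output coordinate i. A, b, c are illustrative parameters;
# the learner observes only y = p(x), never the hidden x.

rng = np.random.default_rng(0)
r, d, n = 5, 3, 1000                          # hidden dim, output dim, sample count

A = rng.standard_normal((d, r, r))            # quadratic part of each coordinate
b = rng.standard_normal((d, r))               # linear part
c = rng.standard_normal(d)                    # constant part

x = rng.standard_normal((n, r))               # hidden latent vectors x ~ N(0, Id_r)
quad = np.einsum('nr,irs,ns->ni', x, A, x)    # x^T A_i x for every sample and coordinate
y = quad + x @ b.T + c                        # observed samples y = p(x), shape (n, d)

print(y.shape)
```

In the setting studied in the paper, only the samples y are available, and the goal is to learn the distribution of y (equivalently, a transformation that produces it) without access to the hidden x.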
