Paper Title
Learning Single-Index Models with Shallow Neural Networks
Paper Authors
Paper Abstract
Single-index models are a class of functions given by an unknown univariate ``link'' function applied to an unknown one-dimensional projection of the input. These models are particularly relevant in high dimension, when the data might present low-dimensional structure that learning algorithms should adapt to. While several statistical aspects of this model, such as the sample complexity of recovering the relevant (one-dimensional) subspace, are well-understood, they rely on tailored algorithms that exploit the specific structure of the target function. In this work, we introduce a natural class of shallow neural networks and study its ability to learn single-index models via gradient flow. More precisely, we consider shallow networks in which biases of the neurons are frozen at random initialization. We show that the corresponding optimization landscape is benign, which in turn leads to generalization guarantees that match the near-optimal sample complexity of dedicated semi-parametric methods.