Paper Title

Estimating Stochastic Linear Combination of Non-linear Regressions Efficiently and Scalably

Paper Authors

Di Wang, Xiangyu Guo, Chaowen Guan, Shi Li, Jinhui Xu

Paper Abstract

Recently, many machine learning and statistical models, such as non-linear regressions, the Single Index, Multi-index, and Varying Coefficient Index Models, and Two-layer Neural Networks, have been shown to reduce to, or be special cases of, a new model called the \textit{Stochastic Linear Combination of Non-linear Regressions} model. However, due to the high non-convexity of the problem, no previous work has studied how to estimate this model. In this paper, we provide the first study on how to estimate the model efficiently and scalably. Specifically, we first show that under some mild assumptions, if the variate vector $x$ is multivariate Gaussian, then there is an algorithm whose output vectors have $\ell_2$-norm estimation errors of $O(\sqrt{\frac{p}{n}})$ with high probability, where $p$ is the dimension of $x$ and $n$ is the number of samples. The key idea of the proof is based on an observation motivated by Stein's lemma. We then extend our result to the case where $x$ is bounded and sub-Gaussian, using the zero-bias transformation, which can be seen as a generalization of the classic Stein's lemma. We also show that with some additional assumptions there is an algorithm whose output vectors have $\ell_\infty$-norm estimation errors of $O(\frac{1}{\sqrt{p}}+\sqrt{\frac{p}{n}})$ with high probability. We also provide a concrete example showing that there exist link functions satisfying the previous assumptions. Finally, for both the Gaussian and sub-Gaussian cases, we propose a faster sub-sampling based algorithm and show that when the sub-sample sizes are large enough, the estimation errors are not sacrificed by too much. Experiments for both cases support our theoretical results. To the best of our knowledge, this is the first work that studies and provides theoretical guarantees for the stochastic linear combination of non-linear regressions model.
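To illustrate the Stein's-lemma observation the abstract refers to, here is a minimal sketch for the simplest single-index instance $y = f(\beta^\top x)$ with Gaussian $x$. By Stein's lemma, $\mathbb{E}[y\,x] = \mathbb{E}[f'(\beta^\top x)]\,\beta$, so the empirical first moment $\frac{1}{n}\sum_i y_i x_i$ recovers the direction of $\beta$ at rate roughly $\sqrt{p/n}$. This is a hypothetical illustration of the general idea, not the paper's actual multi-component algorithm; the link function `tanh` and all variable names are our own choices.

```python
import numpy as np

# Sketch: Stein's-lemma-style moment estimator for a single-index model
# y = f(beta^T x) with x ~ N(0, I_p). Stein's lemma gives
#   E[y * x] = E[f'(beta^T x)] * beta,
# so averaging y_i * x_i over the sample recovers beta up to scale.
rng = np.random.default_rng(0)
p, n = 20, 50_000

beta = rng.normal(size=p)
beta /= np.linalg.norm(beta)          # unit-norm ground truth

x = rng.normal(size=(n, p))           # multivariate Gaussian design
y = np.tanh(x @ beta)                 # non-linear link (illustrative choice)

beta_hat = (y[:, None] * x).mean(axis=0)   # (1/n) * sum_i y_i x_i
beta_hat /= np.linalg.norm(beta_hat)       # normalize away the E[f'] scale

cosine = float(beta_hat @ beta)       # approaches 1 as n grows, error ~ sqrt(p/n)
print(cosine)
```

Since the estimator only identifies $\beta$ up to the scalar $\mathbb{E}[f'(\beta^\top x)]$, the sketch compares directions via cosine similarity; the paper's zero-bias-transformation extension replaces the Gaussian assumption here with bounded sub-Gaussian designs.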
