Paper Title

Approximation Schemes for ReLU Regression

Paper Authors

Ilias Diakonikolas, Surbhi Goel, Sushrut Karmalkar, Adam R. Klivans, Mahdi Soltanolkotabi

Paper Abstract

We consider the fundamental problem of ReLU regression, where the goal is to output the best-fitting ReLU with respect to square loss given access to draws from some unknown distribution. We give the first efficient, constant-factor approximation algorithm for this problem, assuming the underlying distribution satisfies some weak concentration and anti-concentration conditions (which hold, for example, for all log-concave distributions). This solves the main open problem of Goel et al., who proved hardness results for any exact algorithm for ReLU regression (up to an additive $ε$). Using more sophisticated techniques, we can improve our results and obtain a polynomial-time approximation scheme for any subgaussian distribution. Given the aforementioned hardness results, these guarantees cannot be substantially improved. Our main insight is a new characterization of surrogate losses for nonconvex activations. While prior work had established the existence of convex surrogates for monotone activations, we show that properties of the underlying distribution actually induce strong convexity for the loss, allowing us to relate the global minimum to the activation's Chow parameters.
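As a rough illustration of the surrogate-loss idea described in the abstract (this is not the paper's algorithm; the data, dimensions, and step size below are made-up assumptions), the sketch runs gradient descent on one standard convex surrogate for ReLU regression, $L_{\mathrm{sur}}(w) = \mathbb{E}[\mathrm{ReLU}(w \cdot x)^2/2 - y\,(w \cdot x)]$, whose gradient is $\mathbb{E}[(\mathrm{ReLU}(w \cdot x) - y)\,x]$. At a stationary point the learned parameters $\mathbb{E}[\mathrm{ReLU}(w \cdot x)\,x]$ match the target's Chow-style parameters $\mathbb{E}[y\,x]$, and Gaussian marginals stand in for the log-concave distributions under which, per the abstract, the surrogate becomes strongly convex.

```python
import numpy as np

# Illustrative sketch only (not the paper's algorithm): gradient descent on
# a standard convex surrogate for ReLU regression,
#   L_sur(w) = E[ relu(w.x)^2 / 2 - y * (w.x) ],
# whose gradient is E[(relu(w.x) - y) x]. At a stationary point the learned
# parameters E[relu(w.x) x] match the target's Chow parameters E[y x].
# Gaussian marginals below are one example of a log-concave distribution.

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

d, n = 10, 50_000                        # illustrative dimensions (assumed)
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)

X = rng.normal(size=(n, d))              # standard Gaussian (log-concave) marginals
y = relu(X @ w_star) + 0.1 * rng.normal(size=n)  # noisy ReLU labels

w = np.zeros(d)
for _ in range(300):
    grad = X.T @ (relu(X @ w) - y) / n   # empirical surrogate gradient
    w -= 0.5 * grad                      # step size chosen for illustration

print("square loss of learned w:", np.mean((relu(X @ w) - y) ** 2))
print("square loss of w_star:  ", np.mean((relu(X @ w_star) - y) ** 2))
```

On this toy Gaussian instance the surrogate's strong convexity makes plain gradient descent drive the square loss of the learned $w$ down toward that of $w^\*$; the paper's actual guarantees and algorithms are considerably more involved.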
