Paper Title

Gradient Descent Is Optimal Under Lower Restricted Secant Inequality And Upper Error Bound

Paper Authors

Charles Guille-Escuret, Adam Ibrahim, Baptiste Goujaud, Ioannis Mitliagkas

Paper Abstract

The study of first-order optimization is sensitive to the assumptions made on the objective functions. These assumptions induce complexity classes which play a key role in worst-case analysis, including the fundamental concept of algorithm optimality. Recent work argues that strong convexity and smoothness, popular assumptions in the literature, lead to a pathological definition of the condition number (Guille-Escuret et al., 2021). Motivated by this result, we focus on the class of functions satisfying a lower restricted secant inequality and an upper error bound. On top of being robust to the aforementioned pathological behavior and including some non-convex functions, this pair of conditions displays interesting geometrical properties. In particular, the necessary and sufficient conditions to interpolate a set of points and their gradients within the class can be separated into simple conditions on each sampled gradient. This allows the performance estimation problem (PEP; Drori and Teboulle, 2012) to be solved analytically, leading to a lower bound on the convergence rate that proves gradient descent to be exactly optimal on this class of functions among all first-order algorithms.
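For reference, the two conditions named in the title are commonly stated as below, where x* denotes (the projection of x onto) the set of minimizers of f. This is the standard textbook form, given here as an assumption; the exact constants and normalization used in the paper may differ.

```latex
% Lower restricted secant inequality, RSI(mu): for all x,
\langle \nabla f(x),\, x - x^\star \rangle \;\ge\; \mu\,\|x - x^\star\|^2
% Upper error bound, EB(L): for all x,
\|\nabla f(x)\| \;\le\; L\,\|x - x^\star\|
```

The algorithm whose optimality is claimed is plain gradient descent, x_{t+1} = x_t - step_size * grad f(x_t). Below is a minimal runnable sketch on a toy quadratic that satisfies both conditions; the step size 2/(mu + L) is the classical choice for a quadratic with eigenvalues in [mu, L] and is purely illustrative, not necessarily the tuning analyzed in the paper.

```python
import numpy as np

def gradient_descent(grad, x0, step_size, num_steps):
    """Plain gradient descent: x_{t+1} = x_t - step_size * grad(x_t)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_steps):
        x = x - step_size * grad(x)
    return x

# Toy example: f(x) = 0.5 * x^T A x with minimizer x* = 0.
# A quadratic with eigenvalues in [mu, L] satisfies RSI(mu) and EB(L).
mu, L = 1.0, 10.0
A = np.diag([mu, L])
grad_f = lambda x: A @ x

x_final = gradient_descent(grad_f, x0=[5.0, -3.0], step_size=2 / (mu + L), num_steps=200)
print(x_final)  # converges to the minimizer [0, 0]
```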
