Paper Title

Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization

Authors

Samuel Horváth, Lihua Lei, Peter Richtárik, Michael I. Jordan

Abstract

Adaptivity is an important yet under-studied property in modern optimization theory. The gap between the state-of-the-art theory and the current practice is striking in that algorithms with desirable theoretical guarantees typically involve drastically different settings of hyperparameters, such as step-size schemes and batch sizes, in different regimes. Despite the appealing theoretical results, such divisive strategies provide little, if any, insight to practitioners to select algorithms that work broadly without tweaking the hyperparameters. In this work, blending the "geometrization" technique introduced by Lei & Jordan 2016 and the \texttt{SARAH} algorithm of Nguyen et al., 2017, we propose the Geometrized \texttt{SARAH} algorithm for non-convex finite-sum and stochastic optimization. Our algorithm is proved to achieve adaptivity to both the magnitude of the target accuracy and the Polyak-Łojasiewicz (PL) constant if present. In addition, it achieves the best-available convergence rate for non-PL objectives simultaneously while outperforming existing algorithms for PL objectives.
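To make the \texttt{SARAH} building block concrete, below is a minimal sketch of its recursive variance-reduced gradient estimator on a toy finite-sum least-squares problem. All names (`sarah_inner_loop`, the step size, the inner-loop length `m`) are illustrative assumptions, and this sketch deliberately omits the paper's "geometrization" step, in which the inner-loop length and the returned iterate are randomized.

```python
import numpy as np

def sarah_inner_loop(grad_full, grad_i, w0, n, lr=0.05, m=100, rng=None):
    """One SARAH outer iteration (illustrative sketch, not the paper's
    geometrized variant): start from a full gradient, then take m
    recursive variance-reduced stochastic steps."""
    rng = rng or np.random.default_rng(0)
    v = grad_full(w0)                      # v_0 = full gradient at w_0
    w_prev, w = w0, w0 - lr * v
    for _ in range(m):
        i = rng.integers(n)                # sample one component uniformly
        # Recursive estimator: v_t = grad_i(w_t) - grad_i(w_{t-1}) + v_{t-1}
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev, w = w, w - lr * v
    return w

# Toy finite-sum: f(w) = (1/n) * sum_i (a_i . w - b_i)^2 / 2, noiseless.
rng = np.random.default_rng(1)
n, d = 100, 5
A = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
b = A @ w_star
grad_full = lambda w: A.T @ (A @ w - b) / n
grad_i = lambda w, i: A[i] * (A[i] @ w - b[i])

w = np.zeros(d)
for _ in range(20):                        # outer loops
    w = sarah_inner_loop(grad_full, grad_i, w, n, rng=rng)
```

On this strongly convex instance the residual `||A w - b||` shrinks geometrically across outer loops; the point of the sketch is only the recursive estimator update, which reuses the previous direction `v` instead of recomputing a full gradient at every step.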
