Paper Title

Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization

Paper Authors

Ali Kavis, Stratis Skoulakis, Kimon Antonakopoulos, Leello Tadesse Dadi, Volkan Cevher

Paper Abstract

We propose an adaptive variance-reduction method, called AdaSpider, for the minimization of $L$-smooth, non-convex functions with a finite-sum structure. In essence, AdaSpider combines an AdaGrad-inspired [Duchi et al., 2011; McMahan & Streeter, 2010], yet fairly distinct, adaptive step-size schedule with the recursive stochastic path-integrated estimator proposed in [Fang et al., 2018]. To our knowledge, AdaSpider is the first parameter-free non-convex variance-reduction method, in the sense that it does not require knowledge of problem-dependent parameters such as the smoothness constant $L$, the target accuracy $\epsilon$, or any bound on gradient norms. In doing so, we are able to compute an $\epsilon$-stationary point with $\tilde{O}\left(n + \sqrt{n}/\epsilon^2\right)$ oracle calls, which matches the respective lower bound up to logarithmic factors.
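To make the two ingredients named in the abstract concrete, here is a minimal Python sketch combining a SPIDER-style recursive gradient estimator [Fang et al., 2018] with an AdaGrad-style adaptive step size. This is an illustration under assumptions, not the paper's algorithm: the step-size rule `eta0 / sqrt(eps0 + sum of squared estimator norms)` is a generic AdaGrad-like placeholder (the exact AdaSpider schedule differs), and all names (`adaspider_sketch`, `grad_i`, `eta0`, `eps0`, the checkpoint period `q`) are hypothetical.

```python
import numpy as np

def adaspider_sketch(grad_i, n, x0, epochs=5, batch_size=None,
                     eta0=0.1, eps0=1e-8, seed=0):
    """Sketch: SPIDER recursive estimator + AdaGrad-like step size.

    grad_i(i, x) returns the gradient of the i-th component f_i at x.
    NOTE: the step-size rule below is an illustrative placeholder,
    not the schedule derived in the AdaSpider paper.
    """
    rng = np.random.default_rng(seed)
    q = max(1, int(np.sqrt(n)))   # full-gradient checkpoint period
    b = batch_size or q           # minibatch size between checkpoints
    x_prev = x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    acc = 0.0                     # accumulated squared estimator norms

    for t in range(epochs * q):
        if t % q == 0:
            # Checkpoint: reset the estimator to the full gradient.
            v = np.mean([grad_i(i, x) for i in range(n)], axis=0)
        else:
            # SPIDER recursion: correct the previous estimate with a
            # minibatch of gradient differences.
            S = rng.integers(0, n, size=b)
            v = v + np.mean([grad_i(i, x) - grad_i(i, x_prev) for i in S],
                            axis=0)
        # AdaGrad-style step size (assumed placeholder, see docstring).
        acc += float(np.dot(v, v))
        eta = eta0 / np.sqrt(eps0 + acc)
        x_prev, x = x, x - eta * v
    return x

# Toy usage: minimize (1/n) * sum_i 0.5 * (A[i] @ x - b[i])**2.
rng = np.random.default_rng(1)
A, bvec = rng.normal(size=(50, 5)), rng.normal(size=50)
g = lambda i, x: A[i] * (A[i] @ x - bvec[i])
x_out = adaspider_sketch(g, n=50, x0=np.zeros(5), epochs=20)
```

Note the parameter-free flavor the abstract describes: the loop never consults the smoothness constant $L$ or a target accuracy $\epsilon$; the step size adapts only to the observed estimator norms.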
