论文标题

低级收缩的二重性优化:最佳样品复杂性而无需温暖

Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-start

论文作者

Grazzi, Riccardo, Pontil, Massimiliano, Salzo, Saverio

论文摘要

我们分析了一类二聚体问题,其中高级问题在于平稳的目标函数的最小化和下层问题是找到平滑收缩图的固定点。这种类型的问题包括元学习,平衡模型,超参数优化和数据中毒对抗性攻击的实例。最近的几项作品提出了算法,这些算法温暖了较低的问题,即〜他们使用先前的下级近似解决方案作为低级求解器的凝视点。这种温暖的启动程序使人们可以在随机和确定性设置中提高样品复杂性,在某些情况下可以实现订单的最佳样品复杂性。但是,有些情况,例如元学习和平衡模型,其中温暖的启动程序不适合或无效。在这项工作中,我们表明,如果没有温暖的启动,仍然有可能在订单(接近)最佳样本复杂性方面实现。特别是,我们提出了一种简单的方法,该方法在下层处使用(随机)固定点迭代,并在上层处使用不精确的梯度下降,它使用$ o(ε^{ - 2})$和$ o(ε^{ - 2})$和$ \ tilde {O}(O}(O){O}(ε^{ε^{ - 1} $ samples和STECTACTION)的确定,并确定了STESTIS,并确定了STECTIS,并确定$ o(ε^{ - 2})$(ε^{ - 2})$(ε^{ - 2})$。最后,与使用温暖启动的方法相比,我们的方法产生了更简单的分析,不需要研究上层和下层迭代之间的耦合相互作用。

We analyse a general class of bilevel problems, in which the upper-level problem consists in the minimization of a smooth objective function and the lower-level problem is to find the fixed point of a smooth contraction map. This type of problems include instances of meta-learning, equilibrium models, hyperparameter optimization and data poisoning adversarial attacks. Several recent works have proposed algorithms which warm-start the lower-level problem, i.e.~they use the previous lower-level approximate solution as a staring point for the lower-level solver. This warm-start procedure allows one to improve the sample complexity in both the stochastic and deterministic settings, achieving in some cases the order-wise optimal sample complexity. However, there are situations, e.g., meta learning and equilibrium models, in which the warm-start procedure is not well-suited or ineffective. In this work we show that without warm-start, it is still possible to achieve order-wise (near) optimal sample complexity. In particular, we propose a simple method which uses (stochastic) fixed point iterations at the lower-level and projected inexact gradient descent at the upper-level, that reaches an $ε$-stationary point using $O(ε^{-2})$ and $\tilde{O}(ε^{-1})$ samples for the stochastic and the deterministic setting, respectively. Finally, compared to methods using warm-start, our approach yields a simpler analysis that does not need to study the coupled interactions between the upper-level and lower-level iterates.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源