Paper Title
The Statistical Complexity of Early-Stopped Mirror Descent
Paper Authors
Paper Abstract
Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrained mirror descent algorithms applied to the unregularized empirical risk with the squared loss for linear models and kernel methods. By completing an inequality that characterizes convexity for the squared loss, we identify an intrinsic link between offset Rademacher complexities and potential-based convergence analysis of mirror descent methods. Our observation immediately yields excess risk guarantees for the path traced by the iterates of mirror descent in terms of offset complexities of certain function classes depending only on the choice of the mirror map, initialization point, step-size, and the number of iterations. We apply our theory to recover, in a clean and elegant manner via rather short proofs, some of the recent results in the implicit regularization literature, while also showing how to improve upon them in some settings.
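As a minimal illustration of the algorithm the abstract refers to, the sketch below runs mirror descent on the unregularized empirical squared loss for a linear model and returns the full iterate path, from which an early-stopped iterate can be selected. The mirror map, initialization, step-size, and number of iterations here are illustrative assumptions, not the paper's exact choices; with the Euclidean mirror map the update reduces to plain gradient descent.

```python
import numpy as np

def mirror_descent(X, y, T, eta, grad_psi, grad_psi_inv):
    """Sketch: T steps of mirror descent on (1/n) * ||Xw - y||^2.

    grad_psi / grad_psi_inv are the gradient of the mirror map and its
    inverse; both are supplied by the caller (illustrative interface).
    """
    n, d = X.shape
    w = np.zeros(d)                          # initialization point w_0
    theta = grad_psi(w)                      # dual (mirror) variable
    path = [w.copy()]
    for _ in range(T):
        grad = (2.0 / n) * X.T @ (X @ w - y)  # gradient of the empirical risk
        theta = theta - eta * grad            # step in the dual space
        w = grad_psi_inv(theta)               # map back to the primal space
        path.append(w.copy())
    return path                              # early stopping picks an iterate

# Usage example with the Euclidean mirror map psi(w) = ||w||^2 / 2,
# for which mirror descent coincides with plain gradient descent.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
y = X @ rng.standard_normal(20) + 0.1 * rng.standard_normal(100)
path = mirror_descent(X, y, T=50, eta=0.1,
                      grad_psi=lambda w: w, grad_psi_inv=lambda t: t)
```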