Paper Title

Adaptive Gradient Methods with Local Guarantees

Authors

Zhou Lu, Wenhan Xia, Sanjeev Arora, Elad Hazan

Abstract

Adaptive gradient methods are the method of choice for optimization in machine learning and used to train the largest deep models. In this paper we study the problem of learning a local preconditioner, that can change as the data is changing along the optimization trajectory. We propose an adaptive gradient method that has provable adaptive regret guarantees vs. the best local preconditioner. To derive this guarantee, we prove a new adaptive regret bound in online learning that improves upon previous adaptive online learning methods. We demonstrate the robustness of our method in automatically choosing the optimal learning rate schedule for popular benchmarking tasks in vision and language domains. Without the need to manually tune a learning rate schedule, our method can, in a single run, achieve comparable and stable task accuracy as a fine-tuned optimizer.
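To make the notion of a preconditioner concrete: adaptive gradient methods rescale each coordinate of the gradient by a matrix (often diagonal) built from past gradients. The sketch below is a generic AdaGrad-style diagonal preconditioner, shown only as background for the abstract; it is not the paper's proposed algorithm, and the function and variable names are illustrative.

```python
import numpy as np

def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
    """One step of diagonally preconditioned gradient descent (AdaGrad-style).

    The running sum of squared gradients `accum` defines a diagonal
    preconditioner diag(1 / sqrt(accum)): coordinates whose gradients
    have historically been large take proportionally smaller steps.
    """
    accum = accum + grad ** 2
    w = w - lr * grad / (np.sqrt(accum) + eps)
    return w, accum

# Toy usage: minimize f(w) = ||w||^2 / 2, whose gradient is w itself.
w = np.ones(2)
accum = np.zeros(2)
for _ in range(200):
    w, accum = adagrad_step(w, w, accum)
```

A *local* preconditioner, as studied in the paper, additionally allows this rescaling to change across intervals of the optimization trajectory rather than being fixed by the entire gradient history.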
