Paper Title
Learning with Differentiable Perturbed Optimizers
Paper Authors
Quentin Berthet, Mathieu Blondel, Olivier Teboul, Marco Cuturi, Jean-Philippe Vert, Francis Bach
Paper Abstract
Machine learning pipelines often rely on optimization procedures to make discrete decisions (e.g., sorting, picking closest neighbors, or shortest paths). Although these discrete decisions are easily computed, they break the back-propagation of computational graphs. In order to expand the scope of learning problems that can be solved in an end-to-end fashion, we propose a systematic method to transform optimizers into operations that are differentiable and never locally constant. Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers. Their derivatives can be evaluated efficiently, and smoothness tuned via the chosen noise amplitude. We also show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks. We demonstrate experimentally the performance of our approach on various tasks.
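The abstract is high-level; as a minimal sketch of the idea (not the authors' released implementation), the snippet below applies the construction to the simplest discrete optimizer, an argmax over coordinates, using Gaussian noise. The function name `perturbed_argmax` and its parameters are illustrative; the Jacobian formula is the Gaussian-noise estimator E[y Zᵀ]/ε discussed in the paper.

```python
import numpy as np

def perturbed_argmax(theta, epsilon=1.0, num_samples=1000, rng=None):
    """Monte Carlo estimate of a perturbed argmax and its Jacobian.

    The hard argmax over coordinates is piecewise constant in theta, so its
    gradient is zero almost everywhere. Averaging argmax solutions of
    Gaussian-perturbed inputs theta + epsilon * Z yields a smooth,
    differentiable surrogate whose smoothness grows with epsilon.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = theta.shape[0]
    z = rng.standard_normal((num_samples, d))   # Z ~ N(0, I), one row per sample
    # Solve each perturbed problem with the ordinary (non-differentiable) solver.
    idx = np.argmax(theta + epsilon * z, axis=1)
    y = np.eye(d)[idx]                          # one-hot argmax solutions
    y_smooth = y.mean(axis=0)                   # E[argmax(theta + eps * Z)], the smoothed output
    # For Gaussian noise, the Jacobian of the smoothed output admits the
    # estimator E[y Z^T] / epsilon, computed here over the same samples.
    jacobian = y.T @ z / (num_samples * epsilon)
    return y_smooth, jacobian

# Example: a smoothed one-hot vector and a usable Jacobian, where the hard
# argmax would return a constant output with zero gradient.
theta = np.array([1.0, 2.0, 1.5])
y, J = perturbed_argmax(theta, epsilon=0.5, num_samples=10_000)
```

Larger `epsilon` yields a smoother but more biased output, while smaller values approach the hard argmax, matching the abstract's point that smoothness is tuned via the chosen noise amplitude.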