Paper Title
Efficient hyperparameter optimization by way of PAC-Bayes bound minimization
Paper Authors
Paper Abstract
Identifying optimal values for a high-dimensional set of hyperparameters is a problem that has received growing attention given its importance to large-scale machine learning applications such as neural architecture search. Recently developed optimization methods can be used to select thousands or even millions of hyperparameters. Such methods often yield overfit models, however, leading to poor performance on unseen data. We argue that this overfitting results from using the standard hyperparameter optimization objective function. Here we present an alternative objective that is equivalent to a Probably Approximately Correct-Bayes (PAC-Bayes) bound on the expected out-of-sample error. We then devise an efficient gradient-based algorithm to minimize this objective; the proposed method has asymptotic space and time complexity equal to or better than other gradient-based hyperparameter optimization methods. We show that this new method significantly reduces out-of-sample error when applied to hyperparameter optimization problems known to be prone to overfitting.
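To make the idea of minimizing a PAC-Bayes bound concrete, here is a minimal illustrative sketch. It is not the paper's actual objective or algorithm; it uses a generic McAllester-style bound (empirical risk plus a KL-based complexity term) for a one-dimensional Gaussian posterior against a fixed standard-normal prior, with a toy stand-in for the empirical risk, and minimizes the bound by finite-difference gradient descent. All function names and the toy risk are assumptions for illustration only.

```python
import math

def kl_gauss(mu, s):
    # KL( N(mu, s^2) || N(0, 1) ), closed form for two Gaussians
    return 0.5 * (s**2 + mu**2 - 1.0 - 2.0 * math.log(s))

def pac_bayes_bound(mu, s, n=1000, delta=0.05):
    # Toy empirical-risk surrogate (NOT from the paper), chosen so the
    # data-fit term alone would prefer mu = 0.3 and s = 0.
    emp_risk = (mu - 0.3) ** 2 + 0.1 * s**2
    # McAllester-style complexity penalty: grows with KL(Q || P),
    # shrinks with sample size n.
    complexity = math.sqrt(
        (kl_gauss(mu, s) + math.log(2 * math.sqrt(n) / delta)) / (2 * n)
    )
    return emp_risk + complexity

def minimize_bound(mu=1.0, s=2.0, lr=0.05, steps=500, eps=1e-5):
    # Finite-difference gradient descent on the bound itself,
    # rather than on the training loss alone.
    for _ in range(steps):
        g_mu = (pac_bayes_bound(mu + eps, s) - pac_bayes_bound(mu - eps, s)) / (2 * eps)
        g_s = (pac_bayes_bound(mu, s + eps) - pac_bayes_bound(mu, s - eps)) / (2 * eps)
        mu -= lr * g_mu
        s = max(1e-3, s - lr * g_s)  # keep the posterior std positive
    return mu, s
```

The point of the sketch is the shape of the objective: the KL term penalizes posteriors far from the prior, so minimizing the bound trades data fit against complexity instead of driving the training loss to zero, which is the mechanism the abstract credits for reduced overfitting.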