Paper Title

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Author

Hanzely, Filip

Abstract

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used to formulate these often ill-conditioned optimization tasks, there is a need for new efficient algorithms able to cope with these challenges. In this thesis, we deal with each of these sources of difficulty in a different way. To efficiently address the big data issue, we develop new methods which in each iteration examine a small random subset of the training data only. To handle the big model issue, we develop methods which in each iteration update a random subset of the model parameters only. Finally, to deal with ill-conditioned problems, we devise methods that incorporate either higher-order information or Nesterov's acceleration/momentum. In all cases, randomness is viewed as a powerful algorithmic tool that we tune, both in theory and in experiments, to achieve the best results. Our algorithms have their primary application in training supervised machine learning models via regularized empirical risk minimization, which is the dominant paradigm for training such models. However, due to their generality, our methods can be applied in many other fields, including but not limited to data science, engineering, scientific computing, and statistics.
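The abstract names three randomized ingredients: sampling a small random subset of the training data per iteration, updating a random subset of the model parameters per iteration, and Nesterov-style momentum for ill-conditioned problems. The sketch below is a minimal illustration of how these ingredients can combine on L2-regularized least squares (a simple instance of regularized empirical risk minimization); it is an assumed demo, not the thesis's actual algorithms, and all function names and hyperparameters are illustrative choices.

```python
import numpy as np

def randomized_erm_solver(A, b, lam=0.1, lr=0.01, momentum=0.9,
                          batch_size=32, coord_frac=0.25, iters=2000, seed=0):
    """Minimize (1/2n)||Ax - b||^2 + (lam/2)||x||^2 with randomized updates.

    Illustrative sketch combining three ideas from the abstract:
    minibatch data sampling, random coordinate updates, Nesterov momentum.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)                   # model parameters
    v = np.zeros(d)                   # momentum buffer
    k = max(1, int(coord_frac * d))   # coordinates updated per iteration

    for _ in range(iters):
        # (1) Big data: examine only a small random subset of the training data.
        rows = rng.choice(n, size=min(batch_size, n), replace=False)
        A_S, b_S = A[rows], b[rows]

        # Nesterov look-ahead: evaluate the gradient at x + momentum * v.
        y = x + momentum * v
        grad = A_S.T @ (A_S @ y - b_S) / len(rows) + lam * y

        # (2) Big model: update only a random subset of the parameters.
        coords = rng.choice(d, size=k, replace=False)
        sparse_grad = np.zeros(d)
        sparse_grad[coords] = grad[coords]

        # (3) Ill-conditioning: Nesterov-style momentum step.
        v = momentum * v - lr * sparse_grad
        x = x + v

    return x

# Example usage on synthetic data (hypothetical demo, not from the thesis):
rng = np.random.default_rng(1)
A = rng.standard_normal((1000, 50))
x_star = rng.standard_normal(50)
b = A @ x_star + 0.01 * rng.standard_normal(1000)
x_hat = randomized_erm_solver(A, b, lam=0.01)
print(np.linalg.norm(x_hat - x_star))
```

In a practical method the minibatch size, the fraction of coordinates updated, and the momentum parameter would all be tuned; the thesis treats exactly this tuning of randomness, in theory and in experiments, as the central algorithmic tool.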
