Paper Title

MixML: A Unified Analysis of Weakly Consistent Parallel Learning

Paper Authors

Yucheng Lu, Jack Nash, Christopher De Sa

Paper Abstract

Parallelism is a ubiquitous method for accelerating machine learning algorithms. However, theoretical analysis of parallel learning is usually done in an algorithm- and protocol-specific setting, giving little insight into how changes in the structure of communication could affect convergence. In this paper we propose MixML, a general framework for analyzing convergence of weakly consistent parallel machine learning. Our framework includes: (1) a unified way of modeling the communication process among parallel workers; (2) a new parameter, the mixing time tmix, that quantifies how the communication process affects convergence; and (3) a principled way of converting a convergence proof for a sequential algorithm into one for a parallel version that depends only on tmix. We show MixML recovers and improves on known convergence bounds for asynchronous and/or decentralized versions of many algorithms, including SGD and AMSGrad. Our experiments substantiate the theory and show the dependence of convergence on the underlying mixing time.
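
As one concrete instance of the weakly consistent communication the abstract refers to, the sketch below simulates decentralized SGD in which each worker takes a local gradient step and then gossip-averages with its neighbors. This is only an illustration under assumed details (ring topology, least-squares objective, and all helper names here are hypothetical), not the paper's formalism; the point is that how quickly repeated gossip rounds drive the local models toward their average is the kind of quantity a mixing-time parameter such as tmix is meant to capture.

```python
import numpy as np

# Illustrative sketch only (assumed details: ring topology, least-squares
# objective, and every name below). Each worker keeps a local model, takes a
# stochastic gradient step on its data shard, then gossip-averages with its
# ring neighbors. How fast repeated gossip rounds mix the local models toward
# their average is governed by a mixing time of the communication process.

def ring_gossip_matrix(n):
    """Doubly stochastic averaging matrix for a ring of n workers."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def decentralized_sgd(A, b, n_workers=8, steps=1000, lr=0.01, seed=0):
    """Decentralized SGD on the least-squares objective ||Ax - b||^2."""
    rng = np.random.default_rng(seed)
    d = A.shape[1]
    W = ring_gossip_matrix(n_workers)
    X = np.zeros((n_workers, d))                 # row i = worker i's model
    shards = np.array_split(np.arange(A.shape[0]), n_workers)
    for _ in range(steps):
        grads = np.zeros_like(X)
        for i, idx in enumerate(shards):
            j = rng.choice(idx)                  # one random local sample
            grads[i] = 2.0 * A[j] * (A[j] @ X[i] - b[j])
        X = W @ (X - lr * grads)                 # local step, then one gossip round
    return X.mean(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(256, 10))
    x_true = rng.normal(size=10)
    b = A @ x_true
    x_hat = decentralized_sgd(A, b)
    print("parameter error:", np.linalg.norm(x_hat - x_true))
```

In this toy setup, replacing the ring with a denser, better-mixing topology should speed up agreement among workers, which is consistent with the abstract's claim that convergence depends on the underlying mixing time.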
