Paper Title

MixML: A Unified Analysis of Weakly Consistent Parallel Learning

Paper Authors

Yucheng Lu, Jack Nash, Christopher De Sa

Paper Abstract

Parallelism is a ubiquitous method for accelerating machine learning algorithms. However, theoretical analysis of parallel learning is usually done in an algorithm- and protocol-specific setting, giving little insight into how changes in the structure of communication could affect convergence. In this paper we propose MixML, a general framework for analyzing convergence of weakly consistent parallel machine learning. Our framework includes: (1) a unified way of modeling the communication process among parallel workers; (2) a new parameter, the mixing time tmix, that quantifies how the communication process affects convergence; and (3) a principled way of converting a convergence proof for a sequential algorithm into one for a parallel version that depends only on tmix. We show MixML recovers and improves on known convergence bounds for asynchronous and/or decentralized versions of many algorithms, including SGD and AMSGrad. Our experiments substantiate the theory and show the dependence of convergence on the underlying mixing time.
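
As one concrete instance of the weakly consistent communication the abstract refers to, the sketch below simulates decentralized SGD in which each worker takes a local gradient step and then gossip-averages with its neighbors. This is only an illustration under assumed details (ring topology, least-squares objective, and all helper names here are hypothetical), not the paper's formalism; the point is that how quickly repeated gossip rounds drive the local models toward their average is the kind of quantity a mixing-time parameter such as tmix is meant to capture.

```python
import numpy as np

# Illustrative sketch only (assumed details: ring topology, least-squares
# objective, and every name below). Each worker keeps a local model, takes a
# stochastic gradient step on its data shard, then gossip-averages with its
# ring neighbors. How fast repeated gossip rounds mix the local models toward
# their average is governed by a mixing time of the communication process.

def ring_gossip_matrix(n):
    """Doubly stochastic averaging matrix for a ring of n workers."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def decentralized_sgd(A, b, n_workers=8, steps=1000, lr=0.01, seed=0):
    """Decentralized SGD on the least-squares objective ||Ax - b||^2."""
    rng = np.random.default_rng(seed)
    d = A.shape[1]
    W = ring_gossip_matrix(n_workers)
    X = np.zeros((n_workers, d))                 # row i = worker i's model
    shards = np.array_split(np.arange(A.shape[0]), n_workers)
    for _ in range(steps):
        grads = np.zeros_like(X)
        for i, idx in enumerate(shards):
            j = rng.choice(idx)                  # one random local sample
            grads[i] = 2.0 * A[j] * (A[j] @ X[i] - b[j])
        X = W @ (X - lr * grads)                 # local step, then one gossip round
    return X.mean(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(256, 10))
    x_true = rng.normal(size=10)
    b = A @ x_true
    x_hat = decentralized_sgd(A, b)
    print("parameter error:", np.linalg.norm(x_hat - x_true))
```

In this toy setup, replacing the ring with a denser, better-mixing topology should speed up agreement among workers, which is consistent with the abstract's claim that convergence depends on the underlying mixing time.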
