Paper Title

HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism

Paper Authors

Jay H. Park, Gyeongchan Yun, Chang M. Yi, Nguyen T. Nguyen, Seungmin Lee, Jaesik Choi, Sam H. Noh, Young-ri Choi

Paper Abstract

Deep Neural Network (DNN) models have continuously been growing in size in order to improve the accuracy and quality of the models. Moreover, for training of large DNN models, the use of heterogeneous GPUs is inevitable due to the short release cycle of new GPU architectures. In this paper, we investigate how to enable training of large DNN models on a heterogeneous GPU cluster that possibly includes whimpy GPUs that, as a standalone, could not be used for training. We present a DNN training system, HetPipe (Heterogeneous Pipeline), that integrates pipelined model parallelism (PMP) with data parallelism (DP). In HetPipe, a group of multiple GPUs, called a virtual worker, processes minibatches in a pipelined manner, and multiple such virtual workers employ data parallelism for higher performance. We also propose a novel parameter synchronization model, which we refer to as Wave Synchronous Parallel (WSP) to accommodate both PMP and DP for virtual workers, and provide convergence proof of WSP. Our experimental results on a given heterogeneous setting show that with HetPipe, DNN models converge up to 49% faster compared to the state-of-the-art DP technique.
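
The abstract describes a two-level structure: within a virtual worker, a group of heterogeneous GPUs runs pipelined model parallelism (PMP) over partitions of one model replica, while across virtual workers, data parallelism (DP) is coordinated by the Wave Synchronous Parallel (WSP) model. The following is a minimal Python sketch of that structure based only on the abstract; the class names (`VirtualWorker`, `ParameterServer`), the sequential stage loop, and the toy gradient are illustrative assumptions, not the authors' implementation (in the real system, pipeline stages run concurrently on different GPUs).

```python
"""Conceptual sketch of HetPipe's two-level parallelism (PMP + DP).
All names and numeric details here are assumptions for illustration."""


class ParameterServer:
    """Global weights shared by all virtual workers (the DP level).
    Under WSP, each virtual worker pushes one aggregated update per
    'wave' of minibatches rather than synchronizing per minibatch."""

    def __init__(self, init_weights):
        self.weights = list(init_weights)

    def pull(self):
        return list(self.weights)

    def push_wave(self, aggregated_update, lr=0.1):
        # Apply one wave-granularity update from a virtual worker.
        self.weights = [w - lr * g
                        for w, g in zip(self.weights, aggregated_update)]


class VirtualWorker:
    """A group of (possibly whimpy, heterogeneous) GPUs that jointly
    hold one model replica, each GPU owning one partition; minibatches
    stream through the partitions in a pipeline (the PMP level)."""

    def __init__(self, stages):
        self.stages = stages  # one forward function per GPU/partition

    def run_wave(self, minibatches, ps):
        weights = ps.pull()
        agg = [0.0] * len(weights)
        for x in minibatches:
            for stage in self.stages:    # pipelined concurrently in the
                x = stage(x, weights)    # real system; sequential here
            # Toy per-minibatch "gradient", accumulated over the wave.
            agg = [a + (x - w) for a, w in zip(agg, weights)]
        ps.push_wave(agg)                # one DP synchronization per wave


# Hypothetical usage: two virtual workers, each pipelining a wave of
# three minibatches through two model partitions.
ps = ParameterServer([0.0, 0.0])
stage = lambda x, w: 0.5 * x + sum(w)    # stand-in for a model partition
workers = [VirtualWorker([stage, stage]) for _ in range(2)]
for vw in workers:
    vw.run_wave([1.0, 2.0, 3.0], ps)
print(ps.weights)
```

Even in this toy form, the division of labor is visible: GPU heterogeneity is absorbed inside a virtual worker by the pipeline, while virtual workers interact with the global parameters only through coarse, wave-granularity updates.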
