Paper Title

Modeling Data Movement Performance on Heterogeneous Architectures

Authors

Bienz, Amanda; Olson, Luke N.; Gropp, William D.; Lockhart, Shelby

Abstract

The cost of data movement on parallel systems varies greatly with machine architecture, job partition, and nearby jobs. Performance models that accurately capture the cost of data movement provide a tool for analysis, allowing for communication bottlenecks to be pinpointed. Modern heterogeneous architectures yield increased variance in data movement as there are a number of viable paths for inter-GPU communication. In this paper, we present performance models for the various paths of inter-node communication on modern heterogeneous architectures, including the trade-off between GPUDirect communication and copying to CPUs. Furthermore, we present a novel optimization for inter-node communication based on these models, utilizing all available CPU cores per node. Finally, we show associated performance improvements for MPI collective operations.
