Paper title
SKaMPI-OpenSHMEM: Measuring OpenSHMEM Communication Routines
Paper authors
Paper abstract
Benchmarking is an important challenge in HPC, in particular for tuning the basic building blocks of the software environment used by applications. The communication library and the distributed run-time environment are among the most critical of these. Many of the routines provided by communication libraries can be tuned through parameters such as buffer sizes and the choice of communication algorithm. Consequently, accurately measuring the time taken by these routines is crucial to optimizing them and achieving the best performance. For instance, the SKaMPI library was designed to measure the time taken by MPI routines, relying on MPI's two-sided communication model to measure one-sided and two-sided peer-to-peer communication and collective routines. In this paper, we discuss the benchmarking challenges specific to OpenSHMEM's communication model, chiefly the need to avoid inter-call pipelining and overlapping when measuring the time taken by its routines. We extend SKaMPI to OpenSHMEM for this purpose and demonstrate measurement algorithms that address OpenSHMEM's communication model in practice. Scaling experiments run on the Summit platform compare different benchmarking approaches on the SKaMPI benchmark operations. The results show the advantages of our techniques for more accurate performance characterization.