Paper Title

Performance Analysis of Traditional and Data-Parallel Primitive Implementations of Visualization and Analysis Kernels

Authors

Bethel, E. Wes; Camp, David; Perciano, Talita; Heinemann, Colleen

Abstract

Measurements of absolute runtime are useful as a summary of performance when studying parallel visualization and analysis methods on computational platforms of increasing concurrency and complexity. We can obtain even more insights by measuring and examining more detailed measures from hardware performance counters, such as the number of instructions executed by an algorithm implemented in a particular way, the amount of data moved to/from memory, memory hierarchy utilization levels via cache hit/miss ratios, and so forth. This work focuses on performance analysis on modern multi-core platforms of three different visualization and analysis kernels that are implemented in different ways: one is "traditional", using combinations of C++ and VTK, and the other uses a data-parallel approach using VTK-m. Our performance study consists of measurement and reporting of several different hardware performance counters on two different multi-core CPU platforms. The results reveal interesting performance differences between these two different approaches for implementing these kernels, results that would not be apparent using runtime as the only metric.
