Paper Title
Improving the Performance of the GMRES Method using Mixed-Precision Techniques
Paper Authors
Paper Abstract
The GMRES method is used to solve sparse, non-symmetric systems of linear equations arising from many scientific applications. Within a single node, solver performance is memory bound because its computational kernels have low arithmetic intensity. To reduce data movement, and thus improve performance, we investigated the effect of using a mix of single and double precision while retaining double-precision accuracy. Previous efforts have explored reduced precision in the preconditioner, but the use of reduced precision in the solver itself has received limited attention. We found that GMRES needs double precision only when computing the residual and updating the approximate solution to achieve double-precision accuracy, although it must restart after each improvement of single-precision accuracy. This finding holds for both tested orthogonalization schemes: Modified Gram-Schmidt (MGS) and Classical Gram-Schmidt with Re-orthogonalization (CGSR). Furthermore, our mixed-precision GMRES, when restarted at least once, performed 19% and 24% faster on average than double-precision GMRES for MGS and CGSR, respectively. Our implementation uses generic programming techniques to ease the burden of coding implementations for different data types. Our use of the Kokkos library allowed us to exploit parallelism and optimize data management. Additionally, KokkosKernels was used when producing performance results. In conclusion, using a mix of single and double precision in GMRES can improve performance while retaining double-precision accuracy.
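The scheme summarized in the abstract (residual and solution update in double precision, everything else in single precision, with a restart after each single-precision cycle) can be illustrated with a minimal sketch. This is not the authors' Kokkos implementation. It is pure Python on a small dense matrix, single precision is emulated by rounding every arithmetic result through the `struct` module, and all names (`f32`, `f64`, `gmres_cycle`, `mixed_precision_gmres`) as well as the test matrix are hypothetical. The Givens-rotation least-squares solve is a standard textbook choice assumed here, not something the abstract specifies. The `rnd` argument loosely plays the role that a scalar type parameter would play in generic code.

```python
import math
import struct

def f32(x):
    """Round a double to the nearest IEEE-754 single-precision value (emulation)."""
    return struct.unpack("f", struct.pack("f", x))[0]

def f64(x):
    """Identity: Python floats are already double precision."""
    return x

def dot(u, v, rnd):
    acc = 0.0
    for a, b in zip(u, v):
        acc = rnd(acc + rnd(a * b))
    return acc

def matvec(A, x, rnd):
    return [dot(row, x, rnd) for row in A]

def gmres_cycle(A, r, m, rnd):
    """One GMRES(m) cycle with MGS for A d = r, starting from a zero guess.
    Every arithmetic result passes through rnd to emulate the working precision."""
    beta = rnd(math.sqrt(dot(r, r, rnd)))
    V = [[rnd(ri / beta) for ri in r]]
    H = [[0.0] * m for _ in range(m + 1)]
    g = [beta] + [0.0] * m
    cs, sn = [0.0] * m, [0.0] * m
    k = 0
    for j in range(m):
        w = matvec(A, V[j], rnd)
        for i in range(j + 1):  # Modified Gram-Schmidt
            H[i][j] = dot(w, V[i], rnd)
            w = [rnd(wk - rnd(H[i][j] * vk)) for wk, vk in zip(w, V[i])]
        h_next = rnd(math.sqrt(dot(w, w, rnd)))
        for i in range(j):  # apply earlier Givens rotations to the new column
            t = rnd(rnd(cs[i] * H[i][j]) + rnd(sn[i] * H[i + 1][j]))
            H[i + 1][j] = rnd(rnd(cs[i] * H[i + 1][j]) - rnd(sn[i] * H[i][j]))
            H[i][j] = t
        rho = rnd(math.hypot(H[j][j], h_next))
        if rho == 0.0:
            break
        cs[j], sn[j] = rnd(H[j][j] / rho), rnd(h_next / rho)
        H[j][j] = rho
        g[j + 1] = rnd(-sn[j] * g[j])
        g[j] = rnd(cs[j] * g[j])
        k = j + 1
        if h_next == 0.0:  # Krylov subspace already contains the solution
            break
        V.append([rnd(wk / h_next) for wk in w])
    y = [0.0] * k
    for i in reversed(range(k)):  # back substitution on the triangular system
        s = g[i]
        for l in range(i + 1, k):
            s = rnd(s - rnd(H[i][l] * y[l]))
        y[i] = rnd(s / H[i][i])
    d = [0.0] * len(r)
    for i in range(k):
        d = [rnd(dk + rnd(y[i] * vk)) for dk, vk in zip(d, V[i])]
    return d

def mixed_precision_gmres(A, b, m, tol=1e-12, max_cycles=50):
    """Restarted GMRES: residual and update in double, inner cycle in single."""
    A32 = [[f32(a) for a in row] for row in A]  # single-precision copy of A
    x = [0.0] * len(b)
    bnorm = math.sqrt(dot(b, b, f64))
    for cycle in range(max_cycles):
        Ax = matvec(A, x, f64)
        r = [bi - axi for bi, axi in zip(b, Ax)]  # residual in double precision
        if math.sqrt(dot(r, r, f64)) <= tol * bnorm:
            return x, cycle
        d = gmres_cycle(A32, [f32(ri) for ri in r], m, f32)  # single precision
        x = [xi + di for xi, di in zip(x, d)]  # solution update in double
    return x, max_cycles

# Hypothetical 4x4 non-symmetric, diagonally dominant test system.
A = [[4.0, 1.0, 0.0, 0.0], [2.0, 5.0, 1.0, 0.0],
     [0.0, 1.0, 6.0, 2.0], [1.0, 0.0, 1.0, 7.0]]
b = [1.0, 2.0, 3.0, 4.0]
x, cycles = mixed_precision_gmres(A, b, m=4)
print("cycles:", cycles, "x:", x)
```

Passing `f64` instead of `f32` to `gmres_cycle` gives an all-double cycle for comparison. The sketch only illustrates the numerical structure. It says nothing about the speedups reported in the abstract, which come from reduced data movement in compiled, parallel kernels.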