Paper title
Delayed approximate matrix assembly in multigrid with dynamic precisions
Paper authors
Paper abstract
The accurate assembly of the system matrix is an important step in any code that solves partial differential equations on a mesh. We either explicitly set up a matrix, or we work in a matrix-free environment where we have to be able to quickly return matrix entries upon demand. Either way, the construction can become costly due to non-trivial material parameters entering the equations, multigrid codes requiring cascades of matrices that depend upon each other, or dynamic adaptive mesh refinement that necessitates the recomputation of matrix entries or the whole equation system throughout the solve. We propose that these constructions can be performed concurrently with the multigrid cycles. Initial geometric matrices and low accuracy integrations kickstart the multigrid, while improved assembly data is fed to the solver as and when it becomes available. The time to solution is improved as we eliminate an expensive preparation phase traditionally delaying the actual computation. We eliminate algorithmic latency. Furthermore, we desynchronise the assembly from the solution process. This anarchic increase of the concurrency level improves the scalability. Assembly routines are notoriously memory- and bandwidth-demanding. As we work with iteratively improving operator accuracies, we finally propose the use of a hierarchical, lossy compression scheme such that the memory footprint is brought down aggressively where the system matrix entries carry little information or are not yet available with high accuracy.
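The idea of starting the solver on a cheap geometric operator and feeding it better matrix entries as they become available can be illustrated with a small sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the 1D model problem -(k u')' = f with linear finite elements, the material parameter `k`, the function names, and the tolerances. Gauss-Seidel sweeps stand in for multigrid cycles, and the concurrency is emulated by interleaving one quadrature refinement step with each sweep rather than by running real background tasks.

```python
import math

def k(x):
    # Hypothetical material parameter; it varies quickly, so quadrature accuracy matters.
    return 1.0 + 0.9 * math.sin(20.0 * math.pi * x)

def mean_k(e, h, points):
    # Composite midpoint rule for the mean of k over element e = [e*h, (e+1)*h].
    return sum(k((e + (q + 0.5) / points) * h) for q in range(points)) / points

def sweep(u, coeff, f, h):
    # One Gauss-Seidel sweep (stand-in for a multigrid cycle) on the tridiagonal
    # linear-FEM system; u[0] and u[-1] are fixed Dirichlet values.
    for i in range(1, len(coeff)):
        left, right = coeff[i - 1] / h, coeff[i] / h
        u[i] = (f * h + left * u[i - 1] + right * u[i + 1]) / (left + right)

def residual_norm(u, coeff, f, h):
    # Maximum norm of the residual over all interior nodes.
    worst = 0.0
    for i in range(1, len(coeff)):
        left, right = coeff[i - 1] / h, coeff[i] / h
        r = f * h - (left + right) * u[i] + left * u[i - 1] + right * u[i + 1]
        worst = max(worst, abs(r))
    return worst

def solve_delayed(n=16, f=1.0, quad_tol=1e-8, res_tol=1e-10, max_sweeps=5000):
    h = 1.0 / n
    u = [0.0] * (n + 1)
    coeff = [1.0] * n        # geometric start: pretend k == 1 everywhere
    points = [0] * n         # quadrature points used so far per element
    pending = set(range(n))  # elements whose integration is not yet accurate
    assembled_at = None
    for it in range(1, max_sweeps + 1):
        # Delayed assembly step: double the quadrature resolution of every
        # unfinished element and hand the improved entry to the solver.
        for e in list(pending):
            points[e] = max(1, 2 * points[e])
            better = mean_k(e, h, points[e])
            if points[e] > 1 and abs(better - coeff[e]) < quad_tol:
                pending.discard(e)
            coeff[e] = better
        if not pending and assembled_at is None:
            assembled_at = it
        sweep(u, coeff, f, h)  # the solver never waits for the final operator
        if not pending and residual_norm(u, coeff, f, h) < res_tol:
            return u, coeff, assembled_at, it
    return u, coeff, assembled_at, max_sweeps
```

Calling `solve_delayed()` returns the solution, the final element coefficients, the sweep at which the last element finished its integration, and the total number of sweeps. In this sketch the early sweeps run on an inexact operator and are corrected once better coefficients arrive, so no separate assembly phase precedes the iteration.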