Paper Title
High-Quality Fault-Resiliency in Fat-Tree Networks (Extended Abstract)
Paper Authors
Paper Abstract
Coupling regular topologies with optimized routing algorithms is key to pushing the performance of interconnection networks in HPC systems. In this paper we present Dmodc, a fast deterministic routing algorithm for Parallel Generalized Fat-Trees (PGFTs) that minimizes congestion risk even under massive topology degradation caused by equipment failure. It applies a modulo-based computation of forwarding tables among switches closer to the destination, using only knowledge of subtrees for pre-modulo division. Dmodc allows complete rerouting of topologies with tens of thousands of nodes in less than a second, which greatly helps centralized fabric management react to faults with high-quality routing tables and no impact on running applications in current and future very large-scale HPC clusters. We compare Dmodc against routing algorithms available in the InfiniBand control software (OpenSM), first for routing execution time to show feasibility at scale, and then for congestion risk under degradation to demonstrate robustness. The latter comparison is done using static analysis of routing tables under random permutation (RP), shift permutation (SP) and all-to-all (A2A) traffic patterns. Under heavy degradation, Dmodc shows A2A and RP congestion risks similar to those of the most stable algorithms compared, and it keeps SP congestion risk near-optimal for up to 1% random degradation.
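To make the idea of modulo-based forwarding and static congestion analysis more concrete, the following is a minimal sketch and not the Dmodc algorithm itself. It assumes a hypothetical two-level fat-tree in which every leaf switch has one link to every spine switch, picks the spine for a destination by taking the destination index modulo the number of spines that are still reachable, and uses the maximum number of flows sharing a directed link as a simple stand-in for congestion risk under shift permutations. All function names, the topology model and the metric are illustrative assumptions; the abstract does not specify them.

```python
from collections import Counter


def route(src, dst, hosts_per_leaf, num_spines, failed=frozenset()):
    """Return the directed switch-to-switch links used from host src to host dst.

    failed is a set of (leaf, spine) pairs whose link is down (assumed model).
    """
    src_leaf = src // hosts_per_leaf
    dst_leaf = dst // hosts_per_leaf
    if src_leaf == dst_leaf:
        return []  # traffic stays inside the leaf switch
    # Spines that still have a link to both the source and destination leaf.
    alive = [s for s in range(num_spines)
             if (src_leaf, s) not in failed and (dst_leaf, s) not in failed]
    if not alive:
        raise ValueError("destination unreachable")
    # Modulo over the remaining alternatives (illustrative choice only).
    spine = alive[dst % len(alive)]
    return [("up", src_leaf, spine), ("down", spine, dst_leaf)]


def max_link_load(pairs, hosts_per_leaf, num_spines, failed=frozenset()):
    """Largest number of flows sharing one directed link for a traffic pattern."""
    load = Counter()
    for src, dst in pairs:
        load.update(route(src, dst, hosts_per_leaf, num_spines, failed))
    return max(load.values(), default=0)


def shift_permutation(num_hosts, k):
    """Each host s sends to host (s + k) mod num_hosts."""
    return [(s, (s + k) % num_hosts) for s in range(num_hosts)]


def worst_shift_load(num_leaves, hosts_per_leaf, num_spines, failed=frozenset()):
    """Worst max_link_load over all non-trivial shift permutations."""
    n = num_leaves * hosts_per_leaf
    return max(
        max_link_load(shift_permutation(n, k), hosts_per_leaf, num_spines, failed)
        for k in range(1, n)
    )


if __name__ == "__main__":
    # Healthy 4-leaf, 4-host-per-leaf, 4-spine tree: no link is shared.
    print(worst_shift_load(4, 4, 4))  # 1
    # Same tree with the link between leaf 0 and spine 0 removed.
    print(worst_shift_load(4, 4, 4, failed=frozenset({(0, 0)})))
```

In the healthy example each leaf has as many uplinks as hosts, so consecutive destinations spread over distinct spines and no link carries more than one flow. Removing a single uplink forces four flows from one leaf onto three spines, so at least one link must be shared; this is the kind of degradation effect that a static analysis of routing tables can expose without running traffic.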