Paper Title

CAFT: Congestion-Aware Fault-Tolerant Load Balancing for Three-Tier Clos Data Centers

Authors

Alanazi, Sultan; Hamdaoui, Bechir

Abstract

Production data centers operate under workloads of widely varying sizes, ranging from latency-sensitive mice flows to long-lived elephant flows. However, the predominant load balancing scheme in data center networks, equal-cost multi-path (ECMP), is agnostic to path conditions and performs poorly in asymmetric topologies, resulting in low throughput and high latency. In this paper, we propose CAFT, a distributed congestion-aware fault-tolerant load balancing protocol for 3-tier data center networks. It first collects, in real time, the complete congestion information of two subsets of the set of all possible paths between any two hosts. Then, the congestion information of the best path in each subset is carried across the switches during the Transmission Control Protocol (TCP) connection process to make path selection decisions. Having two candidate paths improves the robustness of CAFT to asymmetries caused by link failures. Large-scale ns-3 simulations show that CAFT outperforms Expeditus in mean flow completion time (FCT) and network throughput for both symmetric and asymmetric scenarios.
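The contrast between ECMP and the two-candidate selection described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the two subsets are formed, which congestion metric is used, or how information is carried in packets. The `CandidatePath` type, the scalar `congestion` value (lower is better), the `alive` flag, and all function names are hypothetical and serve only to show the idea of keeping the best path from each of two subsets and then choosing between them.

```python
import zlib
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CandidatePath:
    """Hypothetical path record; the fields are assumptions, not from the paper."""
    path_id: int
    congestion: float  # assumed scalar metric, lower means less congested
    alive: bool = True  # False if a link on the path has failed


def ecmp_pick(flow_key: bytes, n_paths: int) -> int:
    """ECMP-style choice: hash the flow key, ignoring path conditions."""
    return zlib.crc32(flow_key) % n_paths


def best_in_subset(subset: Sequence[CandidatePath]) -> Optional[CandidatePath]:
    """Return the least congested usable path in one subset, or None."""
    usable = [p for p in subset if p.alive]
    return min(usable, key=lambda p: p.congestion) if usable else None


def select_path(subset_a: Sequence[CandidatePath],
                subset_b: Sequence[CandidatePath]) -> Optional[CandidatePath]:
    """Keep the best path of each subset, then pick the better of the two.

    If one subset has no usable path (for example after a link failure),
    the candidate from the other subset is still available.
    """
    candidates = [p for p in (best_in_subset(subset_a), best_in_subset(subset_b))
                  if p is not None]
    return min(candidates, key=lambda p: p.congestion) if candidates else None


if __name__ == "__main__":
    a = [CandidatePath(0, 0.7), CandidatePath(1, 0.2)]
    b = [CandidatePath(2, 0.1, alive=False), CandidatePath(3, 0.4)]
    print(select_path(a, b))
```

In this sketch, a failure that removes every path in one subset still leaves a candidate from the other, which is the intuition behind the robustness claim; a hash-based choice such as `ecmp_pick` has no such awareness of path state.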
