Paper title
LB Scalability: Achieving the Right Balance Between Being Stateful and Stateless
Authors
Abstract
A high-performance Layer-4 load balancer (LB) is one of the most important components of a cloud service infrastructure. Such an LB uses network- and transport-layer information to decide how to distribute client requests across a group of servers. A crucial requirement for a stateful LB is per-connection consistency (PCC): all the packets of the same connection must be forwarded to the same server for as long as that server is alive, even if the pool of servers or the assignment function changes. The challenge is to design a high-throughput, low-latency solution that is also scalable. This paper proposes a highly scalable LB, called Prism, implemented using a programmable switch ASIC. As far as we know, Prism is the first reported LB that can process millions of connections per second and hundreds of millions of connections in total while ensuring PCC. It achieves this by forwarding all packets in hardware, even during server pool changes, while avoiding the need to maintain hardware state for every active connection. We implemented a prototype of the proposed architecture and showed that Prism can scale to 100 million simultaneous connections and can accommodate more than one pool update per second.
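To make the PCC requirement concrete, here is a minimal Python sketch of the tension the title refers to. It is not Prism's design, which the abstract does not detail. It only contrasts a purely stateless hash-based assignment, which can remap live connections when the pool changes, with a naive stateful LB that keeps one table entry per connection. The names `stateless_pick`, `StatefulLB`, and `demo`, the modulo-hash assignment, and the server labels are all assumptions made for illustration.

```python
import hashlib


def stateless_pick(five_tuple, pool):
    """Stateless assignment: hash the connection 5-tuple onto the current pool."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


class StatefulLB:
    """Naive stateful LB: one table entry per active connection (hypothetical)."""

    def __init__(self, pool):
        self.pool = list(pool)
        self.conn_table = {}  # five_tuple -> server chosen for that connection

    def update_pool(self, pool):
        self.pool = list(pool)

    def forward(self, five_tuple):
        server = self.conn_table.get(five_tuple)
        # Reassign only for a new connection or when the pinned server is gone.
        if server is None or server not in self.pool:
            server = stateless_pick(five_tuple, self.pool)
            self.conn_table[five_tuple] = server
        return server


def demo():
    """Count connections that change server after one server joins the pool."""
    conns = [("10.0.0.%d" % (i % 250), 1024 + i, "192.0.2.1", 443, "tcp")
             for i in range(1000)]
    old_pool = ["s1", "s2", "s3", "s4"]
    new_pool = old_pool + ["s5"]

    lb = StatefulLB(old_pool)
    before = {c: lb.forward(c) for c in conns}
    lb.update_pool(new_pool)

    moved_stateful = sum(lb.forward(c) != before[c] for c in conns)
    moved_stateless = sum(stateless_pick(c, new_pool) != before[c] for c in conns)
    return moved_stateless, moved_stateful


if __name__ == "__main__":
    print(demo())
```

In this sketch the stateful table keeps every live connection on its original server, at the cost of memory that grows with the number of active connections. The stateless hash needs no per-connection memory but moves connections whenever the pool size changes. According to the abstract, Prism aims to ensure PCC without keeping hardware state for every active connection.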