Paper Title

Asynchronous Load Balancing and Auto-scaling: Mean-Field Limit and Optimal Design

Paper Author

Anselmi, Jonatha

Abstract

We develop a Markovian framework for load balancing that combines classical algorithms such as Power-of-$d$ with auto-scaling mechanisms that allow the net service capacity to scale up or down in response to the current load on the same timescale as job dynamics. Our framework is inspired by serverless platforms, such as Knative, where servers are software functions that can be flexibly instantiated in milliseconds according to scaling rules defined by the users of the serverless platform. The main question is how to design such scaling rules to minimize user-perceived delay while ensuring low energy consumption. For the first time, we investigate this problem when the auto-scaling and load balancing processes operate asynchronously (or proactively), as in Knative. In contrast to the synchronous (or reactive) paradigm, asynchronism has the advantage that jobs do not necessarily need to wait whenever a scale-up decision is taken. In our main result, we find a general condition on the structure of scaling rules that drives the mean-field dynamics to delay and relative energy optimality, i.e., a situation where both the user-perceived delay and the relative energy waste induced by idle servers vanish in the limit where the network demand grows to infinity in proportion to the nominal service capacity. The identified condition suggests scaling up the current net capacity if and only if the mean demand exceeds the rate at which servers become idle and active. Finally, we propose a family of scaling rules that satisfy our optimality condition. Numerical simulations demonstrate that these rules provide better delay performance than existing synchronous auto-scaling schemes while inducing almost the same power consumption.
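For readers unfamiliar with Power-of-$d$, the classical dispatching rule named in the abstract, the following is a minimal sketch of one dispatch decision: sample $d$ servers uniformly at random and send the job to the one with the fewest jobs. It is an illustration only, not the paper's model. The function name `power_of_d_dispatch`, the list-of-queue-lengths representation, and sampling without replacement are assumptions made here; the auto-scaling rules studied in the paper are not reproduced.

```python
import random


def power_of_d_dispatch(queue_lengths, d, rng=random):
    """Return the index of the server that receives the next job.

    queue_lengths: list of current job counts, one entry per server
                   (hypothetical representation chosen for this sketch).
    d:             number of servers to sample.
    rng:           object exposing sample(), e.g. the random module
                   or a random.Random instance.
    """
    n = len(queue_lengths)
    if not 1 <= d <= n:
        raise ValueError("d must be between 1 and the number of servers")
    # Sample d distinct servers uniformly at random
    # (assumption: sampling without replacement).
    candidates = rng.sample(range(n), d)
    # Route the job to the sampled server with the shortest queue.
    return min(candidates, key=lambda i: queue_lengths[i])


if __name__ == "__main__":
    queues = [3, 0, 2, 5]
    target = power_of_d_dispatch(queues, d=2)
    queues[target] += 1  # the chosen server takes the job
    print(target, queues)
```

With `d` equal to the number of servers the rule reduces to join-the-shortest-queue, and with `d = 1` it reduces to uniformly random routing.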
