Paper Title

RISCLESS: A Reinforcement Learning Strategy to Exploit Unused Cloud Resources

Authors

Sidahmed Yalles, Mohamed Handaoui, Jean-Emile Dartois, Olivier Barais, Laurent d'Orazio, Jalil Boukhobza

Abstract

One of the main objectives of Cloud Providers (CPs) is to guarantee the Service-Level Agreements (SLAs) of customers while reducing operating costs. To achieve this goal, CPs have built large-scale datacenters. This leads, however, to underutilized resources and an increase in costs. One way to improve resource utilization is to reclaim the unused parts and resell them at a lower price. Providing SLA guarantees to customers on reclaimed resources is a challenge because of their high volatility. Some state-of-the-art solutions keep a proportion of resources free to absorb sudden variations in workloads. Others add stable resources on top of the volatile ones to fill in for lost resources. However, these strategies either reduce the amount of reclaimable resources or operate on less volatile ones such as Amazon Spot Instances. In this paper, we propose RISCLESS, a Reinforcement Learning strategy to exploit unused Cloud resources. Our approach uses a small proportion of stable on-demand resources alongside the ephemeral ones in order to guarantee customers' SLAs and reduce overall costs. The approach decides when to allocate stable resources, and how many, in order to fulfill customers' demands. RISCLESS improved CPs' profits by an average of 15.9% compared to state-of-the-art strategies. It also reduced the SLA violation time by an average of 36.7% while increasing the amount of used ephemeral resources by 19.5% on average.