Multi-Agent Deep Reinforcement Learning for Cost- and Delay-Sensitive Virtual Network Function Placement and Routing

Paper Title

Multi-Agent Deep Reinforcement Learning for Cost- and Delay-Sensitive Virtual Network Function Placement and Routing

Authors

Wang, Shaoyang; Yuen, Chau; Ni, Wei; Liang, Guan Yong; Lv, Tiejun

Abstract

This paper proposes an effective and novel multi-agent deep reinforcement learning (MADRL)-based method for joint virtual network function (VNF) placement and routing (P&R), in which multiple service requests with differentiated demands are served at the same time. The differentiated demands are captured by each request's delay-sensitive and cost-sensitive factors. We first formulate a VNF P&R problem that jointly minimizes a weighted sum of service delay and resource consumption cost; this problem is NP-complete. The joint VNF P&R problem is then decoupled into two iterative subtasks, a placement subtask and a routing subtask, each consisting of multiple concurrent sequential decision processes. Building on the deep deterministic policy gradient method and multi-agent techniques, an MADRL-P&R framework is designed to perform the two subtasks. A new mechanism combining a joint reward with internal rewards is proposed to match the goals and constraints of the placement and routing subtasks. We also propose a parameter migration-based model-retraining method to handle changing network topologies. Experiments show that the proposed MADRL-P&R framework outperforms its alternatives in service cost and delay, and offers greater flexibility for personalized service demands. The parameter migration-based model-retraining method effectively accelerates convergence under moderate network topology changes.
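
As a rough illustration of the objective described in the abstract, the weighted sum of service delay and resource consumption cost might be computed as in the sketch below. The abstract does not give the exact formulation, so the class name, the field names, and the per-request weighting are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ServiceRequest:
    """Hypothetical container for one service request (names are illustrative)."""
    delay: float         # service delay under a given placement and route
    cost: float          # resource consumption cost under the same decision
    delay_weight: float  # assumed delay-sensitive factor of this request
    cost_weight: float   # assumed cost-sensitive factor of this request


def weighted_objective(requests: Iterable[ServiceRequest]) -> float:
    """Sum each request's weighted delay and weighted cost.

    This is an assumed form of the objective; the paper's actual
    formulation and constraints are not stated in the abstract.
    """
    return sum(r.delay_weight * r.delay + r.cost_weight * r.cost for r in requests)


# Example: one balanced request and one cost-sensitive request.
requests = [
    ServiceRequest(delay=10.0, cost=2.0, delay_weight=0.5, cost_weight=0.5),
    ServiceRequest(delay=4.0, cost=8.0, delay_weight=0.25, cost_weight=0.75),
]
total = weighted_objective(requests)
```

Under this assumed form, a request with a larger delay weight penalizes slow placements and routes more heavily, while a larger cost weight favors cheaper resource usage, which is one way the differentiated demands could enter a single objective.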
