Paper Title
Distributed Resource Allocation with Multi-Agent Deep Reinforcement Learning for 5G-V2V Communication
Authors
Abstract
We consider the distributed resource selection problem in vehicle-to-vehicle (V2V) communication in the absence of a base station. Each vehicle autonomously selects transmission resources from a pool of shared resources to disseminate Cooperative Awareness Messages (CAMs). This is a consensus problem in which each vehicle has to select a unique resource. The problem becomes more challenging when, due to mobility, the number of vehicles in the vicinity of each other changes dynamically. In a congested scenario, allocating a unique resource to each vehicle becomes infeasible, and a congested resource allocation strategy has to be developed. The standardized approach in 5G, namely semi-persistent scheduling (SPS), suffers from effects caused by the spatial distribution of the vehicles. In our approach, we turn this into an advantage. We propose a novel DIstributed Resource Allocation mechanism using multi-agent reinforcement Learning (DIRAL), which builds on a unique state representation. One challenging issue is coping with the non-stationarity introduced by concurrently learning agents, which causes convergence problems in multi-agent learning systems. We aim to tackle non-stationarity with a unique state representation. Specifically, we deploy a view-based positional distribution as the state representation to tackle non-stationarity and to perform complex joint behavior in a distributed fashion. Our results show that DIRAL improves the packet reception ratio (PRR) by 20% compared to SPS in challenging congested scenarios.
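The resource selection problem described above can be illustrated with a small toy model. The sketch below is neither DIRAL nor SPS. It assumes a single collision domain in which every vehicle hears every other vehicle, and it uses uniform random selection purely as a placeholder policy. All function names are hypothetical. It shows why a unique resource per vehicle becomes infeasible once vehicles outnumber the resources in the pool.

```python
import random
from collections import Counter


def random_selection(num_vehicles, num_resources, rng):
    """Each vehicle independently picks one resource index uniformly at random.

    Placeholder policy for illustration only; it is not the paper's method.
    """
    return [rng.randrange(num_resources) for _ in range(num_vehicles)]


def collision_free_fraction(choices):
    """Fraction of vehicles whose chosen resource is used by no other vehicle."""
    if not choices:
        return 1.0
    usage = Counter(choices)
    unique = sum(1 for resource in choices if usage[resource] == 1)
    return unique / len(choices)


def max_collision_free(num_vehicles, num_resources):
    """Largest number of vehicles that can hold a resource alone.

    Assumes every vehicle must pick exactly one resource and all vehicles
    share one collision domain. If vehicles outnumber resources, at least
    one resource is shared, so at most num_resources - 1 vehicles are alone.
    """
    if num_vehicles <= num_resources:
        return num_vehicles
    return max(num_resources - 1, 0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draw
    for n in (5, 10, 20):
        picks = random_selection(n, 10, rng)
        print(n, collision_free_fraction(picks), max_collision_free(n, 10))
```

In this simplified setting the collision-free fraction is only a rough stand-in for PRR, which in practice also depends on distance, interference, and the spatial distribution of vehicles.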