Distributed Resource Allocation with Multi-Agent Deep Reinforcement Learning for 5G-V2V Communication

Paper Title

Distributed Resource Allocation with Multi-Agent Deep Reinforcement Learning for 5G-V2V Communication

Authors

Gündogan, Alperen; Gürsu, H. Murat; Pauli, Volker; Kellerer, Wolfgang

Abstract

We consider the distributed resource selection problem in vehicle-to-vehicle (V2V) communication in the absence of a base station. Each vehicle autonomously selects transmission resources from a pool of shared resources to disseminate Cooperative Awareness Messages (CAMs). This is a consensus problem in which each vehicle has to select a unique resource. The problem becomes more challenging when, due to mobility, the number of vehicles in the vicinity of each other changes dynamically. In a congested scenario, allocating a unique resource to each vehicle becomes infeasible, and a resource allocation strategy suited to congestion has to be developed. The standardized approach in 5G, namely semi-persistent scheduling (SPS), suffers from effects caused by the spatial distribution of the vehicles. In our approach, we turn this into an advantage. We propose a novel DIstributed Resource Allocation mechanism using multi-agent reinforcement Learning (DIRAL), which builds on a unique state representation. One challenging issue is coping with the non-stationarity introduced by concurrently learning agents, which causes convergence problems in multi-agent learning systems. We aim to tackle non-stationarity with this state representation. Specifically, we deploy a view-based positional distribution as the state representation to tackle non-stationarity and to perform complex joint behavior in a distributed fashion. Our results show that DIRAL improves the packet reception ratio (PRR) by 20% compared to SPS in challenging congested scenarios.
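The consensus framing in the abstract can be made concrete with a toy sketch. This is not DIRAL or SPS; it only illustrates, under simplifying assumptions, why a unique allocation stops being feasible once the vehicles in range outnumber the shared resources, and how overlapping choices can be counted. The function names are hypothetical, and the model ignores distance, interference, and half-duplex effects.

```python
from collections import Counter


def unique_allocation_feasible(num_vehicles, num_resources):
    """A unique resource per vehicle exists only if the pool is large enough."""
    return num_vehicles <= num_resources


def collision_free_fraction(choices):
    """Fraction of vehicles whose chosen resource is used by no other vehicle.

    `choices` holds one resource index per vehicle (toy model: every vehicle
    is assumed to be in range of every other vehicle).
    """
    if not choices:
        return 0.0
    counts = Counter(choices)
    alone = sum(1 for resource in choices if counts[resource] == 1)
    return alone / len(choices)


if __name__ == "__main__":
    # Four vehicles, two of which picked the same resource (index 2).
    print(collision_free_fraction([0, 1, 2, 2]))
    # Five vehicles cannot each hold a unique resource from a pool of four.
    print(unique_allocation_feasible(5, 4))
```

In this simplified picture, a congested scenario is one where `unique_allocation_feasible` is false, so some reuse of resources is unavoidable and the question becomes which vehicles should share.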
