Paper Title

Learning Scalable Multi-Agent Coordination by Spatial Differentiation for Traffic Signal Control

Authors

Junjia Liu, Huimin Zhang, Zhuang Fu, Yao Wang

Abstract

The intelligent control of traffic signals is critical to the optimization of transportation systems. To achieve globally optimal traffic efficiency in large-scale road networks, recent works have focused on coordination among intersections and have shown promising results. However, existing studies pay more attention to observation sharing among intersections (both explicit and implicit) and do not consider the consequences of decisions. In this paper, we design a multi-agent coordination framework based on Deep Reinforcement Learning for traffic signal control, termed γ-Reward, which includes both the original γ-Reward and γ-Attention-Reward. Specifically, we propose the Spatial Differentiation method for coordination, which uses the temporal-spatial information in the replay buffer to amend the reward of each action. A concise theoretical analysis proves that the proposed model can converge to a Nash equilibrium. By extending the idea of the Markov Chain to the space-time dimension, this truly decentralized coordination mechanism replaces the graph attention method and realizes the decoupling of the road network, which is more scalable and more in line with practice. Simulation results show that the proposed model achieves state-of-the-art performance even without a centralized setting. Code is available at https://github.com/Skylark0924/Gamma Reward.
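The abstract describes amending each intersection's reward with temporal-spatial information from the replay buffer. A minimal sketch of that idea is below, assuming a simple per-agent reward log and a fixed propagation delay; the function name, parameters, and the exact amendment formula are illustrative, not taken from the paper or its released code.

```python
# Hypothetical sketch of the Spatial Differentiation idea: an agent's
# immediate reward is amended with the discounted change in its
# neighbors' rewards a few steps later, read back from a replay buffer.

def amend_reward(replay_buffer, t, agent, neighbors, gamma=0.9, delay=2):
    """Amend `agent`'s reward at step t with neighbors' delayed rewards.

    replay_buffer: dict mapping agent id -> list of per-step rewards
    t: time step whose reward is being amended
    neighbors: ids of spatially adjacent intersections
    gamma: spatial-temporal discount factor (assumed scalar here)
    delay: steps until a decision's effect reaches downstream neighbors
    """
    own = replay_buffer[agent][t]
    future = t + delay
    correction = 0.0
    for n in neighbors:
        rewards = replay_buffer[n]
        if future < len(rewards):
            # The difference between a neighbor's delayed reward and its
            # reward at decision time approximates the downstream
            # consequence of the agent's action.
            correction += gamma * (rewards[future] - rewards[t])
    # Average over neighbors so the correction scale is degree-independent.
    return own + correction / max(len(neighbors), 1)
```

Because the correction is computed per intersection from local neighbors only, no centralized critic is needed, which matches the decentralized, decoupled design the abstract emphasizes.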
