Paper Title
Networked Multi-Agent Reinforcement Learning with Emergent Communication
Paper Authors
Paper Abstract
Multi-Agent Reinforcement Learning (MARL) methods find optimal policies for agents that operate in the presence of other learning agents. Central to achieving this is how the agents coordinate. One way to coordinate is by learning to communicate with each other. Can the agents develop a language while learning to perform a common task? In this paper, we formulate and study a MARL problem in which cooperative agents are connected to each other via a fixed underlying network. These agents can communicate along the edges of this network by exchanging discrete symbols. However, the semantics of these symbols are not predefined; during training, the agents are required to develop a language that helps them accomplish their goals. We propose a method for training these agents using emergent communication. We demonstrate the applicability of the proposed framework by applying it to the problem of managing traffic controllers, where we achieve state-of-the-art performance compared to a number of strong baselines. More importantly, we perform a detailed analysis of the emergent communication, showing, for instance, that the developed language is grounded, and we demonstrate its relationship with the underlying network topology. To the best of our knowledge, this is the only work that performs an in-depth analysis of emergent communication in a networked MARL setting while being applicable to a broad class of problems.