Paper Title
Learning Multi-Agent Coordination through Connectivity-driven Communication
Paper Authors
Paper Abstract
In artificial multi-agent systems, the ability to learn collaborative policies is predicated upon the agents' communication skills: they must be able to encode the information received from the environment and learn how to share it with other agents as required by the task at hand. We present a deep reinforcement learning approach, Connectivity Driven Communication (CDC), that facilitates the emergence of multi-agent collaborative behaviour purely through experience. The agents are modelled as nodes of a weighted graph whose state-dependent edges encode pair-wise messages that can be exchanged. We introduce a graph-dependent attention mechanism that controls how the agents' incoming messages are weighted. This mechanism takes full account of the current state of the system as represented by the graph, and builds upon a diffusion process that captures how information flows over the graph. The graph topology is not assumed to be known a priori, but depends dynamically on the agents' observations, and is learnt concurrently with the attention mechanism and policy in an end-to-end fashion. Our empirical results show that CDC is able to learn effective collaborative policies and can outperform competing learning algorithms on cooperative navigation tasks.
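To make the abstract's idea of diffusion-based message weighting concrete, here is a minimal sketch of how incoming-message attention weights could be derived from a state-dependent agent graph via a heat-diffusion process. The function name, the heat-kernel choice, and the toy adjacency matrix are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def diffusion_attention(adjacency, t=1.0):
    """Illustrative sketch (not CDC's exact method): derive per-agent
    incoming-message weights from a weighted agent graph via the heat
    kernel exp(-t * L) of the symmetric-normalised graph Laplacian."""
    n = len(adjacency)
    deg = adjacency.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    # Symmetric-normalised Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(n) - d_inv_sqrt[:, None] * adjacency * d_inv_sqrt[None, :]
    # Heat kernel exp(-tL): captures how information diffuses over the graph
    eigvals, eigvecs = np.linalg.eigh(lap)
    heat = eigvecs @ np.diag(np.exp(-t * eigvals)) @ eigvecs.T
    # Row-normalise so each agent's incoming weights sum to one
    return heat / heat.sum(axis=1, keepdims=True)

# Toy 3-agent graph: agents 0 and 1 strongly connected, agent 2 weakly linked.
# In CDC the edge weights would depend on the agents' current observations.
A = np.array([[0.0, 1.0, 0.1],
              [1.0, 0.0, 0.1],
              [0.1, 0.1, 0.0]])
W = diffusion_attention(A)
```

With this toy graph, agent 0 ends up weighting messages from agent 1 more heavily than those from the weakly connected agent 2, which is the qualitative behaviour the abstract's diffusion mechanism is meant to provide.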