Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning

Paper Title

Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning

Authors

Shunyu Liu, Jie Song, Yihe Zhou, Na Yu, Kaixuan Chen, Zunlei Feng, Mingli Song

Abstract

Deep cooperative multi-agent reinforcement learning has demonstrated its remarkable success over a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions still intertwined, which easily leads to over-fitting on noisy interactions between entities. In this work, we introduce a novel interactiOn Pattern disenTangling (OPT) method, to disentangle the entity interactions into interaction prototypes, each of which represents an underlying interaction pattern within a subgroup of the entities. OPT facilitates filtering the noisy interactions between irrelevant entities and thus significantly improves generalizability as well as interpretability. Specifically, OPT introduces a sparse disagreement mechanism to encourage sparsity and diversity among discovered interaction prototypes. Then the model selectively restructures these prototypes into a compact interaction pattern by an aggregator with learnable weights. To alleviate the training instability issue caused by partial observability, we propose to maximize the mutual information between the aggregation weights and the history behaviors of each agent. Experiments on single-task, multi-task and zero-shot benchmarks demonstrate that the proposed method yields results superior to the state-of-the-art counterparts. Our code is available at https://github.com/liushunyu/OPT.
