Paper Title

Remember and Forget Experience Replay for Multi-Agent Reinforcement Learning

Paper Authors

Pascal Weber, Daniel Wälchli, Mustafa Zeqiri, Petros Koumoutsakos

Paper Abstract

We present the extension of the Remember and Forget for Experience Replay (ReF-ER) algorithm to Multi-Agent Reinforcement Learning (MARL). ReF-ER was shown to outperform state-of-the-art algorithms for continuous control in problems ranging from the OpenAI Gym to complex fluid flows. In MARL, the dependencies between the agents are included in the state-value estimator, and the environment dynamics are modeled via the importance weights used by ReF-ER. In collaborative environments, we find the best performance when the value is estimated using individual rewards and the effects of the other agents' actions on the transition map are ignored. We benchmark the performance of ReF-ER MARL on the Stanford Intelligent Systems Laboratory (SISL) environments. We find that employing a single feed-forward neural network for the policy and the value function in ReF-ER MARL outperforms state-of-the-art algorithms that rely on complex neural network architectures.
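
The core rule of ReF-ER filters replayed experiences by their importance weight ρ = π(a|s)/μ(a|s): samples whose weight falls outside a trust interval [1/c_max, c_max] are treated as far-policy and excluded from gradient updates, while near-policy samples are kept ("remembered"). The minimal Python sketch below illustrates that filtering, together with the per-agent variant suggested by the abstract, where each agent's weight uses only its own action probability (i.e. the effects of the other agents' actions on the transition map are ignored). The function names and the cutoff value are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def near_policy_mask(pi_probs, mu_probs, c_max=4.0):
    """Classify replayed experiences as near- or far-policy (ReF-ER rule).

    pi_probs : probability of the stored action under the current policy
    mu_probs : probability of the same action under the behavior policy
               that generated the sample (stored in the replay buffer)
    c_max    : cutoff on the importance weight (hypothetical value)
    """
    rho = pi_probs / mu_probs                    # importance weights
    near = (rho > 1.0 / c_max) & (rho < c_max)   # "remember" these samples
    return rho, near                             # far-policy samples are "forgotten"

def per_agent_weights(pi_probs_per_agent, mu_probs_per_agent, c_max=4.0):
    """Per-agent weights for the MARL setting: agent i uses only its own
    action probabilities, ignoring the other agents' actions."""
    return [near_policy_mask(pi_i, mu_i, c_max)
            for pi_i, mu_i in zip(pi_probs_per_agent, mu_probs_per_agent)]
```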
