On the Verge of Solving Rocket League using Deep Reinforcement Learning and Sim-to-sim Transfer

Paper title

On the Verge of Solving Rocket League using Deep Reinforcement Learning and Sim-to-sim Transfer

Authors

Marco Pleines, Konstantin Ramthun, Yannik Wegener, Hendrik Meyer, Matthias Pallasch, Sebastian Prior, Jannik Drögemüller, Leon Büttinghaus, Thilo Röthemeyer, Alexander Kaschwig, Oliver Chmurzynski, Frederik Rohkrähmer, Roman Kalkreuth, Frank Zimmer, Mike Preuss

Abstract

Autonomously trained agents that are supposed to play video games reasonably well rely either on fast simulation speeds or heavy parallelization across thousands of machines running concurrently. This work explores a third way that is established in robotics, namely sim-to-real transfer, or if the game is considered a simulation itself, sim-to-sim transfer. In the case of Rocket League, we demonstrate that single behaviors of goalies and strikers can be successfully learned using Deep Reinforcement Learning in the simulation environment and transferred back to the original game. Although the implemented training simulation is to some extent inaccurate, the goalkeeping agent saves nearly 100% of its faced shots once transferred, while the striking agent scores in about 75% of cases. Therefore, the trained agent is robust enough and able to generalize to the target domain of Rocket League.
