Dream and Search to Control: Latent Space Planning for Continuous Control

Paper Title

Dream and Search to Control: Latent Space Planning for Continuous Control

Authors

Anurag Koul, Varun V. Kumar, Alan Fern, Somdeb Majumdar

Abstract

Learning and planning with latent space dynamics has been shown to be useful for sample efficiency in model-based reinforcement learning (MBRL) for discrete and continuous control tasks. In particular, recent work, for discrete action spaces, demonstrated the effectiveness of latent-space planning via Monte-Carlo Tree Search (MCTS) for bootstrapping MBRL during learning and at test time. However, the potential gains from latent-space tree search have not yet been demonstrated for environments with continuous action spaces. In this work, we propose and explore an MBRL approach for continuous action spaces based on tree-based planning over learned latent dynamics. We show that it is possible to demonstrate the types of bootstrapping benefits as previously shown for discrete spaces. In particular, the approach achieves improved sample efficiency and performance on a majority of challenging continuous-control benchmarks compared to the state-of-the-art.
