Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning

Paper title

Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning

Authors

Nicolas Castanet, Sylvain Lamprier, Olivier Sigaud

Abstract

In multi-goal Reinforcement Learning, an agent can share experience between related training tasks, resulting in better generalization for new tasks at test time. However, when the goal space has discontinuities and the reward is sparse, a majority of goals are difficult to reach. In this context, a curriculum over goals helps agents learn by adapting training tasks to their current capabilities. In this work we propose Stein Variational Goal Generation (SVGG), which samples goals of intermediate difficulty for the agent, by leveraging a learned predictive model of its goal reaching capabilities. The distribution of goals is modeled with particles that are attracted in areas of appropriate difficulty using Stein Variational Gradient Descent. We show that SVGG outperforms state-of-the-art multi-goal Reinforcement Learning methods in terms of success coverage in hard exploration problems, and demonstrate that it is endowed with a useful recovery property when the environment changes.
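
The particle mechanism mentioned in the abstract can be illustrated with a generic Stein Variational Gradient Descent update. The following is a minimal NumPy sketch, not the authors' implementation: the function names, the fixed RBF bandwidth, and the step size are all illustrative choices, and a simple 2D Gaussian stands in for the target density over goals (in SVGG the target would instead be derived from the learned model of goal-reaching capability, which is not reproduced here).

```python
import numpy as np


def rbf_kernel(particles, bandwidth):
    """Return the RBF kernel matrix k[j, i] = k(x_j, x_i) and its gradient w.r.t. x_j."""
    # diff[j, i] = x_j - x_i, shape (n, n, d)
    diff = particles[:, None, :] - particles[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # grad_k[j, i] = d k(x_j, x_i) / d x_j = -(x_j - x_i) / h^2 * k(x_j, x_i)
    grad_k = -diff / bandwidth ** 2 * k[:, :, None]
    return k, grad_k


def svgd_step(particles, grad_log_p, step_size=0.05, bandwidth=1.0):
    """Apply one SVGD update to an (n, d) array of particles."""
    n = particles.shape[0]
    k, grad_k = rbf_kernel(particles, bandwidth)
    score = grad_log_p(particles)      # (n, d): gradient of log p at each particle
    attraction = k.T @ score           # pulls particle i toward high-density areas
    repulsion = grad_k.sum(axis=0)     # pushes particle i away from its neighbours
    return particles + step_size * (attraction + repulsion) / n


# Hypothetical stand-in target: an isotropic Gaussian over a 2D goal space.
# ASSUMPTION: in SVGG this density would come from a learned success predictor.
CENTER = np.array([2.0, -1.0])
SIGMA = 0.5


def grad_log_target(goals):
    return -(goals - CENTER) / SIGMA ** 2


rng = np.random.default_rng(0)
goals = rng.normal(scale=0.5, size=(50, 2))  # initial goal particles near the origin
for _ in range(300):
    goals = svgd_step(goals, grad_log_target)
```

The attraction term moves particles toward regions where the target density is high, while the repulsion term keeps them from collapsing onto a single point, so the particle set approximates the target distribution rather than just its mode.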
