Paper Title
Robust Meta-Reinforcement Learning with Curriculum-Based Task Sampling
Paper Authors
Paper Abstract
Meta-reinforcement learning (meta-RL) acquires meta-policies that perform well across tasks drawn from a wide task distribution. However, conventional meta-RL, which learns meta-policies by sampling tasks randomly, has been reported to show meta-overfitting for certain tasks, especially easy tasks on which an agent can readily obtain high scores. To reduce the effects of meta-overfitting, we consider meta-RL with curriculum-based task sampling. Our method, Robust Meta Reinforcement Learning with Guided Task Sampling (RMRL-GTS), restricts task sampling based on scores and epochs. We show that to achieve robust meta-RL, it is necessary not only to intensively sample tasks with poor scores, but also to restrict and then expand the task region from which tasks are sampled.
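The abstract describes two ingredients: sampling low-scoring tasks more often, and restricting and then expanding the task region as training progresses. The sketch below is a minimal illustration of that general idea in Python, not the paper's RMRL-GTS implementation; the function name `sample_tasks`, the discretization of the task parameter into bins, and the linear expansion schedule are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def sample_tasks(scores, epoch, num_tasks=16, max_epoch=500):
    """Illustrative curriculum-style task sampling (not the paper's algorithm).

    `scores` maps a discretized task parameter (e.g., goal-distance bin)
    to the agent's recent average return on that bin. Bins with lower
    scores are sampled more often, and the range of bins considered
    grows with the training epoch.
    """
    num_bins = len(scores)

    # Restrict the task region early in training and expand it over epochs
    # (assumed linear schedule).
    active_bins = max(1, int(num_bins * min(1.0, (epoch + 1) / max_epoch)))
    active_scores = np.asarray(scores[:active_bins], dtype=float)

    # Convert scores to sampling weights: lower score -> higher weight.
    inverted = active_scores.max() - active_scores + 1e-6
    probs = inverted / inverted.sum()

    # Sample task-parameter bins, then jitter within each bin to get
    # continuous task parameters in [0, 1).
    bins = rng.choice(active_bins, size=num_tasks, p=probs)
    return (bins + rng.random(num_tasks)) / num_bins


if __name__ == "__main__":
    # Hypothetical running scores over 10 task-parameter bins.
    scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1, 0.05, 0.0, 0.0, 0.0]
    print(sample_tasks(scores, epoch=100))
```

In this sketch the low-scoring (harder) bins receive higher sampling probability, while the epoch-dependent cap on `active_bins` plays the role of the restricted-then-expanded task region mentioned in the abstract.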