Diverse Imitation Learning via Self-Organizing Generative Models

Paper Title

Diverse Imitation Learning via Self-Organizing Generative Models

Authors

Arash Vahabpour, Tianyi Wang, Qiujing Lu, Omead Pooladzandi, Vwani Roychowdhury

Abstract

Imitation learning is the task of replicating expert policy from demonstrations, without access to a reward function. This task becomes particularly challenging when the expert exhibits a mixture of behaviors. Prior work has introduced latent variables to model variations of the expert policy. However, our experiments show that the existing works do not exhibit appropriate imitation of individual modes. To tackle this problem, we adopt an encoder-free generative model for behavior cloning (BC) to accurately distinguish and imitate different modes. Then, we integrate it with GAIL to make the learning robust towards compounding errors at unseen states. We show that our method significantly outperforms the state of the art across multiple experiments.
