Paper Title
Learning visual servo policies via planner cloning
Paper Authors
Paper Abstract
Learning control policies for visual servoing in novel environments is an important problem. However, standard model-free policy learning methods are slow. This paper explores planner cloning: using behavior cloning to learn policies that mimic the behavior of a full-state motion planner in simulation. We propose Penalized Q Cloning (PQC), a new behavior cloning algorithm, and show that it outperforms several baselines and ablations on challenging problems involving visual servoing in novel environments while avoiding obstacles. Finally, we demonstrate that these policies transfer effectively to a real robotic platform, achieving a success rate of approximately 87% both in simulation and on the real robot.
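The sketch below illustrates the planner-cloning idea described in the abstract: a visuomotor policy is trained with a supervised (behavior cloning) loss to imitate actions produced by a full-state motion planner in simulation. The network architecture, discrete action space, and all names here are illustrative assumptions, and the paper's Penalized Q Cloning (PQC) objective includes a penalty term that is not reproduced in this plain cloning loss.

```python
# Minimal planner-cloning sketch (assumed setup, not the paper's exact method).
# A CNN policy maps camera images to logits over a small set of end-effector
# motions and is trained to match action labels produced by a full-state planner.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VisuomotorPolicy(nn.Module):
    """Maps a camera image to logits over a discrete set of motions."""

    def __init__(self, num_actions: int = 6):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, num_actions)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(image))


def cloning_step(policy, optimizer, images, planner_actions):
    """One behavior-cloning update: fit the planner's action labels."""
    logits = policy(images)
    loss = F.cross_entropy(logits, planner_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    policy = VisuomotorPolicy()
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    # Stand-in batch: in the paper's setting these would be rendered images
    # paired with actions computed by the full-state planner in simulation.
    images = torch.rand(8, 3, 64, 64)
    actions = torch.randint(0, 6, (8,))
    print(cloning_step(policy, opt, images, actions))
```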