Paper Title
LBGP: Learning Based Goal Planning for Autonomous Following in Front
Paper Authors
Paper Abstract
This paper investigates a hybrid solution that combines deep reinforcement learning (RL) with classical trajectory planning for the following-in-front application, in which an autonomous robot aims to stay ahead of a person as the person walks around freely. Following in front is challenging because the user's intended trajectory is unknown and must be estimated, explicitly or implicitly, by the robot. In addition, the robot needs to find a feasible way to navigate safely ahead of the human's trajectory. Our deep RL module implicitly estimates the human's trajectory and produces short-term navigational goals to guide the robot. A trajectory planner uses these goals to navigate the robot smoothly to each short-term goal and, eventually, to a position in front of the user. We employ curriculum learning in the deep RL module to efficiently achieve a high return. Our system outperforms the state of the art in following ahead and is more reliable than end-to-end alternatives in both simulation and real-world experiments. In contrast to a pure deep RL approach, we demonstrate zero-shot transfer of the trained policy from simulation to the real world.
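The two-stage structure described in the abstract (a goal proposer feeding a trajectory planner) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `heuristic_goal` is a hand-written stand-in for the learned deep RL policy, `step_toward` is a straight-line stand-in for the trajectory planner, and the names and parameters (`lead_distance`, `max_step`, `human_speed`) are hypothetical and not taken from the paper.

```python
import math


def heuristic_goal(human_xy, human_heading, lead_distance=1.5):
    """Hypothetical stand-in for the learned policy: propose a short-term
    goal lead_distance metres ahead of the human along their heading."""
    hx, hy = human_xy
    return (hx + lead_distance * math.cos(human_heading),
            hy + lead_distance * math.sin(human_heading))


def step_toward(robot_xy, goal_xy, max_step=0.2):
    """Hypothetical stand-in for the trajectory planner: move the robot at
    most max_step metres in a straight line toward the goal."""
    rx, ry = robot_xy
    dx, dy = goal_xy[0] - rx, goal_xy[1] - ry
    dist = math.hypot(dx, dy)
    if dist <= max_step:
        return goal_xy  # goal reachable within this step
    scale = max_step / dist
    return (rx + dx * scale, ry + dy * scale)


def simulate(steps=100, human_speed=0.1):
    """Run the goal-proposer / planner loop with a human walking straight
    along the x axis; returns the final (human, robot) positions."""
    human, heading = (0.0, 0.0), 0.0
    robot = (-1.0, 0.5)  # robot starts behind and beside the human
    for _ in range(steps):
        # The human advances; the robot does not know the future path.
        human = (human[0] + human_speed * math.cos(heading),
                 human[1] + human_speed * math.sin(heading))
        goal = heuristic_goal(human, heading)  # stage 1: short-term goal
        robot = step_toward(robot, goal)       # stage 2: move toward it
    return human, robot


if __name__ == "__main__":
    human, robot = simulate()
    print("human:", human, "robot:", robot)
```

In this toy setting the robot is faster than the human, so it catches up with the moving goal and then holds a position `lead_distance` ahead. The sketch only shows the division of labour between goal selection and motion execution; it does not reproduce the learned trajectory estimation, curriculum learning, or the planner used in the paper.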