LBGP: Learning Based Goal Planning for Autonomous Following in Front

Paper title

LBGP: Learning Based Goal Planning for Autonomous Following in Front

Authors

Payam Nikdel, Richard Vaughan, Mo Chen

Abstract

This paper investigates a hybrid solution that combines deep reinforcement learning (RL) and classical trajectory planning for the following-in-front application. Here, an autonomous robot aims to stay ahead of a person as the person freely walks around. Following in front is a challenging problem because the user's intended trajectory is unknown and needs to be estimated, explicitly or implicitly, by the robot. In addition, the robot needs to find a feasible way to safely navigate ahead of the human's trajectory. Our deep RL module implicitly estimates the human's trajectory and produces short-term navigational goals to guide the robot. These goals are used by a trajectory planner to smoothly navigate the robot to the short-term goals, and eventually in front of the user. We employ curriculum learning in the deep RL module to efficiently achieve a high return. Our system outperforms the state of the art in following ahead and is more reliable than end-to-end alternatives in both simulation and real-world experiments. In contrast to a pure deep RL approach, we demonstrate zero-shot transfer of the trained policy from simulation to the real world.
