L2B: Learning to Balance the Safety-Efficiency Trade-off in Interactive Crowd-aware Robot Navigation

Paper Title

L2B: Learning to Balance the Safety-Efficiency Trade-off in Interactive Crowd-aware Robot Navigation

Authors

Mai Nishimura, Ryo Yonetani

Abstract

This work presents a deep reinforcement learning framework for interactive navigation in crowded places. Our proposed approach, the Learning to Balance (L2B) framework, enables mobile robot agents to steer safely towards their destinations by avoiding collisions with a crowd, while actively clearing a path by asking nearby pedestrians to make room, if necessary, to keep their travel efficient. We observe that the safety and efficiency requirements in crowd-aware navigation are in a trade-off in the presence of social dilemmas between the agent and the crowd. On the one hand, intervening in pedestrian paths too much to achieve instant efficiency will disrupt the natural crowd flow and may eventually put everyone, including the agent itself, at risk of collisions. On the other hand, staying silent to avoid every single collision will make the agent's travel inefficient. Based on this observation, our L2B framework augments the reward function used in learning an interactive navigation policy to penalize both frequent active path clearing and passive collision avoidance, which substantially improves the balance of the safety-efficiency trade-off. We evaluate our L2B framework in a challenging crowd simulation and demonstrate its superiority, in terms of both navigation success and collision rate, over a state-of-the-art navigation approach.
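
The reward augmentation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' reward function: the `StepInfo` fields, the `balanced_reward` name, and every coefficient are hypothetical, chosen only to show how a per-step reward might charge a small cost both for asking pedestrians to make room and for overly passive behavior (approximated here by a per-step time cost).

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical summary of one simulation step (not from the paper)."""
    reached_goal: bool
    collided: bool
    asked_to_clear: bool  # agent asked nearby pedestrians to make room


def balanced_reward(
    step: StepInfo,
    *,
    goal_reward: float = 1.0,        # assumed value
    collision_penalty: float = 0.25,  # assumed value
    clearing_penalty: float = 0.05,   # assumed cost of active path clearing
    time_penalty: float = 0.01,       # assumed cost of passive waiting
) -> float:
    """Return an illustrative per-step reward balancing safety and efficiency."""
    # Terminal outcomes dominate everything else.
    if step.collided:
        return -collision_penalty
    if step.reached_goal:
        return goal_reward

    # Every non-terminal step costs a little, so endless passive
    # avoidance (never making progress) is discouraged.
    reward = -time_penalty

    # Asking pedestrians to move is allowed but not free, so the agent
    # is discouraged from doing it at every step.
    if step.asked_to_clear:
        reward -= clearing_penalty
    return reward
```

Under these assumed coefficients, a step that requests clearance is always scored lower than an otherwise identical quiet step, while a long sequence of quiet steps still accumulates a cost. The relative size of `clearing_penalty` and `time_penalty` is the knob that shifts the policy between the two extremes the abstract describes.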
