Paper Title

MVP: Unified Motion and Visual Self-Supervised Learning for Large-Scale Robotic Navigation

Authors

Marvin Chancán, Michael Milford

Abstract

Autonomous navigation emerges from both motion and local visual perception in real-world environments. However, most successful robotic motion estimation methods (e.g. VO, SLAM, SfM) and vision systems (e.g. CNNs, visual place recognition (VPR)) are typically used separately for mapping and localization tasks. Conversely, recent reinforcement learning (RL) based methods for visual navigation rely on the quality of GPS reception, which may not be reliable when used directly as ground truth across multiple, month-spaced traversals of large environments. In this paper, we propose a novel motion and visual perception approach, dubbed MVP, that unifies these two sensor modalities for large-scale, target-driven navigation tasks. Our MVP-based method learns faster, and is more accurate and more robust to both extreme environmental changes and poor GPS data, than corresponding vision-only navigation methods. MVP temporally incorporates compact image representations, obtained using VPR, with optimized motion estimation data, including but not limited to those from VO or optimized radar odometry (RO), to efficiently learn self-supervised navigation policies via RL. We evaluate our method on two large real-world datasets, Oxford RobotCar and Nordland Railway, over a range of weather (e.g. overcast, night, snow, sun, rain, clouds) and seasonal (e.g. winter, spring, fall, summer) conditions, using the new CityLearn framework, an interactive environment for efficiently training navigation agents. Our experimental results on traversals of the Oxford RobotCar dataset with no GPS data show that MVP achieves navigation success rates of 53% with VO and 93% with RO, compared to 7% for a vision-only method. We additionally report a trade-off between RL success rate and motion estimation precision.
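The abstract gives only a high-level description of how MVP fuses the two modalities. The sketch below (Python with PyTorch) illustrates one plausible reading: a compact VPR embedding and per-step odometry deltas are projected to a common size, concatenated, and aggregated over time by a recurrent policy that outputs action logits. Everything here, including the class name MVPPolicy, the dimensions, and the linear-projection-plus-LSTM layout, is an illustrative assumption, not the authors' actual architecture.

    import torch
    import torch.nn as nn

    class MVPPolicy(nn.Module):
        """Hypothetical policy: fuses a compact VPR image embedding with
        motion-estimation deltas (e.g. from VO or RO), aggregates the
        fused stream over time with an LSTM, and emits action logits."""

        def __init__(self, vpr_dim=512, motion_dim=3, hidden_dim=256, num_actions=4):
            super().__init__()
            self.vpr_fc = nn.Linear(vpr_dim, hidden_dim)        # project VPR features
            self.motion_fc = nn.Linear(motion_dim, hidden_dim)  # project odometry deltas
            self.rnn = nn.LSTM(2 * hidden_dim, hidden_dim, batch_first=True)
            self.action_head = nn.Linear(hidden_dim, num_actions)

        def forward(self, vpr_emb, motion, state=None):
            # vpr_emb: (batch, seq, vpr_dim)    compact place-recognition features
            # motion:  (batch, seq, motion_dim) e.g. per-step (dx, dy, dtheta)
            fused = torch.cat(
                [torch.relu(self.vpr_fc(vpr_emb)),
                 torch.relu(self.motion_fc(motion))], dim=-1)
            out, state = self.rnn(fused, state)  # temporal aggregation
            return self.action_head(out), state

    # One rollout step with dummy inputs:
    policy = MVPPolicy()
    logits, hidden = policy(torch.randn(1, 1, 512), torch.randn(1, 1, 3))
    action = logits.argmax(dim=-1)  # greedy action, for illustration only

Under this reading, the VO-versus-RO comparison reported in the abstract would amount to swapping the source of the motion input while keeping the rest of the pipeline fixed, which is consistent with the stated trade-off between success rate and motion estimation precision.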
