Extending Maps with Semantic and Contextual Object Information for Robot Navigation: a Learning-Based Framework using Visual and Depth Cues

Paper Title

Extending Maps with Semantic and Contextual Object Information for Robot Navigation: a Learning-Based Framework using Visual and Depth Cues

Authors

Renato Martins, Dhiego Bersan, Mario F. M. Campos, Erickson R. Nascimento

Abstract

This paper addresses the problem of building augmented metric representations of scenes with semantic information from RGB-D images. We propose a complete framework to create an enhanced map representation of the environment with object-level information, for use in applications such as human-robot interaction, assistive robotics, visual navigation, and manipulation tasks. Our formulation combines a CNN-based object detector (YOLO) with a 3D model-based segmentation technique to perform instance semantic segmentation and to localize, identify, and track different classes of objects in the scene. The tracking and positioning of semantic classes is done with a dictionary of Kalman filters, which combines sensor measurements over time and thereby provides more accurate maps. The formulation is designed to identify and disregard dynamic objects in order to obtain a medium-term invariant map representation. The proposed method was evaluated with collected and publicly available RGB-D data sequences acquired in different indoor scenes. Experimental results show the potential of the technique to produce augmented semantic maps containing several objects (notably doors). We also provide the community with a dataset composed of annotated object classes (doors, fire extinguishers, benches, water fountains) and their positions, as well as the source code as ROS packages.
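
The "dictionary of Kalman filters" idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes static objects, a scalar (isotropic) covariance, nearest-neighbour association within a fixed gate, and 3D detections already expressed in the map frame. The names `StaticKalman3D`, `ObjectMap`, `gate`, `meas_var`, and `process_var`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StaticKalman3D:
    """Kalman filter for a static 3D point with a scalar covariance (assumption)."""
    mean: tuple                 # position estimate (x, y, z) in the map frame
    var: float                  # scalar variance of the estimate, in m^2
    process_var: float = 1e-4   # hypothetical process noise added per step

    def predict(self):
        # Static motion model: the mean is unchanged, uncertainty grows slightly.
        self.var += self.process_var

    def update(self, z, meas_var):
        # Kalman update with an identity observation model.
        gain = self.var / (self.var + meas_var)
        self.mean = tuple(m + gain * (zi - m) for m, zi in zip(self.mean, z))
        self.var = (1.0 - gain) * self.var


class ObjectMap:
    """Dictionary of filters keyed by (class_name, instance_index)."""

    def __init__(self, gate=1.0, meas_var=0.05):
        self.filters = {}
        self.gate = gate          # hypothetical association radius, in metres
        self.meas_var = meas_var  # hypothetical measurement variance, in m^2

    def observe(self, class_name, z):
        # Associate the detection with the nearest same-class filter inside the gate.
        best_key, best_d2 = None, self.gate ** 2
        for key, kf in self.filters.items():
            if key[0] != class_name:
                continue
            d2 = sum((a - b) ** 2 for a, b in zip(kf.mean, z))
            if d2 <= best_d2:
                best_key, best_d2 = key, d2
        if best_key is None:
            # No match: start a new filter for a new object instance.
            index = sum(1 for k in self.filters if k[0] == class_name)
            best_key = (class_name, index)
            self.filters[best_key] = StaticKalman3D(tuple(z), self.meas_var)
            return best_key
        kf = self.filters[best_key]
        kf.predict()
        kf.update(z, self.meas_var)
        return best_key


object_map = ObjectMap()
first = object_map.observe("door", (1.0, 0.0, 0.0))
second = object_map.observe("door", (1.2, 0.0, 0.0))   # fused into the same door
third = object_map.observe("door", (5.0, 0.0, 0.0))    # too far away: a new door
```

In this sketch, repeated detections of the same door shrink the variance of its filter, while a detection outside the gate creates a new entry in the dictionary. Handling of dynamic objects, which the abstract also mentions, is left out.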
