Paper Title

Semantic sensor fusion: from camera to sparse lidar information

Paper Authors

Berrio, Julie Stephany, Shan, Mao, Worrall, Stewart, Ward, James, Nebot, Eduardo

Paper Abstract

To navigate through urban roads, an automated vehicle must be able to perceive and recognize objects in a three-dimensional environment. A high-level contextual understanding of the surroundings is necessary to plan and execute accurate driving maneuvers. This paper presents an approach to fuse different sensory information: Light Detection and Ranging (lidar) scans and camera images. The output of a convolutional neural network (CNN) is used as a classifier to obtain the semantic labels of the environment. The transfer of semantic information from the labelled images to the lidar point cloud is performed in four steps. Initially, we use heuristic methods to associate probabilities with all the semantic classes contained in the labelled images. Then, the lidar points are corrected to compensate for the vehicle's motion, given the difference between the timestamps of each lidar scan and camera image. In the third step, we calculate the pixel coordinates of each lidar point in the corresponding camera image. In the last step, we transfer the semantic information from the heuristic probability images to the lidar frame, while removing the lidar points that are not visible to the camera. We tested our approach on the USyd Dataset \cite{usyd_dataset}, obtaining qualitative and quantitative results that demonstrate the validity of our probabilistic sensory fusion approach.
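As a rough illustration of the third and fourth steps, the sketch below projects already motion-corrected lidar points into the image plane with a standard pinhole camera model and reads the per-class probabilities at the resulting pixels. This is a minimal sketch under stated assumptions, not the paper's implementation: the names (`T_cam_lidar`, `K`, `prob_image`) are hypothetical, and the frustum and bounds test here is a simpler visibility check than the occlusion removal the abstract describes.

```python
import numpy as np

def project_lidar_to_image(points_xyz, T_cam_lidar, K, image_shape):
    """Project 3D lidar points into the camera image plane.

    points_xyz : (N, 3) lidar points, assumed already motion-corrected.
    T_cam_lidar: (4, 4) hypothetical extrinsic transform, lidar -> camera.
    K          : (3, 3) camera intrinsic matrix.
    image_shape: (height, width) of the camera image.
    Returns integer pixel coordinates (M, 2) and indices of visible points.
    """
    # Homogeneous coordinates, then transform into the camera frame.
    pts_h = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    # Keep only points in front of the camera; the rest cannot be seen.
    in_front = pts_cam[:, 2] > 0.0
    pts_cam = pts_cam[in_front]

    # Pinhole projection: u = fx*X/Z + cx, v = fy*Y/Z + cy.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]

    # Discard points that project outside the image bounds.
    h, w = image_shape
    in_bounds = ((uv[:, 0] >= 0) & (uv[:, 0] < w) &
                 (uv[:, 1] >= 0) & (uv[:, 1] < h))
    visible_idx = np.flatnonzero(in_front)[in_bounds]
    return uv[in_bounds].astype(int), visible_idx

def transfer_labels(prob_image, uv):
    """Look up per-class probabilities at each projected pixel.

    prob_image: (H, W, num_classes) heuristic probability image, one
                channel per semantic class (hypothetical layout).
    uv        : (M, 2) integer pixel coordinates from the projection.
    Returns an (M, num_classes) probability vector per visible lidar point.
    """
    return prob_image[uv[:, 1], uv[:, 0], :]
```

The probability image is assumed to have one channel per semantic class, as produced by the heuristic first step, so each visible lidar point simply inherits the probability vector of the pixel it projects onto.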
