Sparse Semantic Map-Based Monocular Localization in Traffic Scenes Using Learned 2D-3D Point-Line Correspondences

Paper Title

Sparse Semantic Map-Based Monocular Localization in Traffic Scenes Using Learned 2D-3D Point-Line Correspondences

Authors

Chen, Xingyu; Xue, Jianru; Pang, Shanmin

Abstract

Vision-based localization in a prior map is of crucial importance for autonomous vehicles. Given a query image, the goal is to estimate the camera pose corresponding to the prior map, and the key is the registration problem of camera images within the map. While autonomous vehicles drive on the road under occlusion (e.g., car, bus, truck) and changing environment appearance (e.g., illumination changes, seasonal variation), existing approaches rely heavily on dense point descriptors at the feature level to solve the registration problem, entangling features with appearance and occlusion. As a result, they often fail to estimate the correct poses. To address these issues, we propose a sparse semantic map-based monocular localization method, which solves 2D-3D registration via a well-designed deep neural network. Given a sparse semantic map that consists of simplified elements (e.g., pole lines, traffic sign midpoints) with multiple semantic labels, the camera pose is then estimated by learning the corresponding features between the 2D semantic elements from the image and the 3D elements from the sparse semantic map. The proposed sparse semantic map-based localization approach is robust against occlusion and long-term appearance changes in the environments. Extensive experimental results show that the proposed method outperforms the state-of-the-art approaches.
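As a rough illustration of what a 2D-3D point-line correspondence constrains, the sketch below projects sparse map elements into the image with a pinhole camera and measures how far they land from their matched 2D detections. It is not the network described in the abstract, which learns the correspondences; it only shows the geometric residuals a candidate pose would be scored against once matches are known. All function names, the intrinsics, and the sample coordinates are hypothetical and chosen for illustration.

```python
import math


def project(point_3d, rotation, translation, fx, fy, cx, cy):
    """Project a 3D map point into the image with a pinhole model (assumed)."""
    # Map frame to camera frame: p_cam = R * p_map + t
    x, y, z = (
        sum(rotation[i][j] * point_3d[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def point_residual(observed_2d, projected_2d):
    """Pixel distance between a detected point and a projected map point."""
    return math.hypot(observed_2d[0] - projected_2d[0],
                      observed_2d[1] - projected_2d[1])


def line_through(p, q):
    """Homogeneous 2D line (a, b, c) through p and q, scaled so a^2 + b^2 = 1."""
    a = p[1] - q[1]
    b = q[0] - p[0]
    c = p[0] * q[1] - q[0] * p[1]
    norm = math.hypot(a, b)
    return (a / norm, b / norm, c / norm)


def line_residual(observed_line, projected_endpoints):
    """Mean pixel distance from projected 3D line endpoints to a detected line."""
    a, b, c = observed_line
    distances = [abs(a * u + b * v + c) for u, v in projected_endpoints]
    return sum(distances) / len(distances)


# Hypothetical pose (identity) and intrinsics, for illustration only.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
intrinsics = dict(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)

# A traffic sign midpoint (point element) and a pole (line element) in the map.
sign_px = project((1.0, 0.5, 10.0), R, t, **intrinsics)
pole_px = [project(p, R, t, **intrinsics)
           for p in ((2.0, -1.0, 10.0), (2.0, 1.0, 10.0))]

# Matched 2D detections in the query image (made-up values).
sign_error = point_residual((743.0, 406.0), sign_px)
pole_error = line_residual(line_through((842.0, 0.0), (842.0, 720.0)), pole_px)
```

A pose estimator would search for the rotation and translation that make such residuals small across all matched elements; how the matches themselves are obtained is the part the paper assigns to the learned network.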
