Pseudo-LiDAR for Visual Odometry

Paper Title

Pseudo-LiDAR for Visual Odometry

Authors

Deng, Huiying; Wang, Guangming; Feng, Zhiheng; Jiang, Chaokang; Wu, Xinrui; Miao, Yanzi; Wang, Hesheng

Abstract

Among existing methods, LiDAR odometry shows superior performance, but visual odometry is still widely used because of its lower cost. Conventionally, visual odometry relies mainly on consecutive images as input. However, it is very complicated for an odometry network to learn the epipolar geometry information provided by the images. In this paper, the concept of pseudo-LiDAR is introduced into odometry to solve this problem. The pseudo-LiDAR point cloud is obtained by back-projecting the depth map generated from the images into 3D space, which changes the way the image is represented. Compared with stereo images, the pseudo-LiDAR point cloud generated by the stereo matching network provides explicit 3D coordinates. Since the 6 Degrees of Freedom (DoF) pose transformation occurs in 3D space, the 3D structure information provided by the pseudo-LiDAR point cloud is more direct than that provided by the image. Compared with sparse LiDAR, pseudo-LiDAR yields a denser point cloud. To make full use of the rich point cloud information provided by pseudo-LiDAR, a projection-aware dense odometry pipeline is adopted. Most previous LiDAR-based algorithms sample 8192 points from the point cloud as input to the odometry network. The projection-aware dense odometry pipeline instead takes all pseudo-LiDAR points generated from the images, except for the error points, as input to the network. While the 3D geometric information in the images is fully exploited, the semantic information in the images is also used in the odometry task, so 2D-3D fusion is achieved in an image-only odometry. Experiments on the KITTI dataset demonstrate the effectiveness of our method. To the best of our knowledge, this is the first visual odometry method using pseudo-LiDAR.
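
The back-projection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a standard pinhole camera model; the function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the validity rule (non-positive depth, or depth beyond an optional `max_depth`) are illustrative assumptions and are not taken from the paper, which does not specify here how error points are identified.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: Optional[float] = None,
) -> List[Point]:
    """Back-project a depth map into 3D points in the camera frame.

    depth[v][u] is the depth (e.g. in metres) at pixel column u, row v.
    fx, fy, cx, cy are pinhole intrinsics (assumed, not from the paper).
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            # Assumed stand-in for "error points": skip non-positive depths
            # and, if requested, depths beyond max_depth.
            if z <= 0.0 or (max_depth is not None and z > max_depth):
                continue
            # Invert the pinhole projection u = fx * x / z + cx,
            # v = fy * y / z + cy.
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

Every pixel with a valid depth yields one 3D point, which is why a pseudo-LiDAR cloud is dense compared with a fixed sample of 8192 LiDAR points. With a stereo matching network, the depth map itself would typically come from the predicted disparity via depth = focal length × baseline / disparity.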
