WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection

Paper Title

WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection

Authors

Liang Peng, Senbo Yan, Boxi Wu, Zheng Yang, Xiaofei He, Deng Cai

Abstract

Monocular 3D object detection is one of the most challenging tasks in 3D scene understanding. Due to the ill-posed nature of monocular imagery, existing monocular 3D detection methods rely heavily on training with manually annotated 3D box labels on LiDAR point clouds. This annotation process is laborious and expensive. To dispense with the reliance on 3D box labels, in this paper we explore weakly supervised monocular 3D detection. Specifically, we first detect 2D boxes on the image. Then, we use the generated 2D boxes to select the corresponding RoI LiDAR points as weak supervision. Finally, we adopt a network to predict 3D boxes that tightly align with the associated RoI LiDAR points. This network is learned by minimizing our newly proposed 3D alignment loss between the 3D box estimates and the corresponding RoI LiDAR points. We illustrate the potential challenges of this learning problem and resolve them by introducing several effective designs into our method. Code will be available at https://github.com/SPengLiang/WeakM3D.
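The RoI point selection step mentioned in the abstract can be pictured with a minimal sketch. This is not the authors' implementation. It assumes the LiDAR points are already expressed in the camera frame (x right, y down, z forward) and that a simple pinhole model applies; the function name `select_roi_points` and the intrinsics parameters `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]        # (x, y, z) in the camera frame, metres
Box2D = Tuple[float, float, float, float]   # (u_min, v_min, u_max, v_max) in pixels


def select_roi_points(
    points_cam: List[Point3D],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    box2d: Box2D,
) -> List[Point3D]:
    """Keep the points whose pinhole projection falls inside a 2D box.

    Assumption: points are already in the camera frame and the camera
    follows an ideal pinhole model without lens distortion.
    """
    u_min, v_min, u_max, v_max = box2d
    roi: List[Point3D] = []
    for x, y, z in points_cam:
        if z <= 0.0:
            # Points behind the image plane cannot appear in the image.
            continue
        u = fx * x / z + cx  # horizontal pixel coordinate
        v = fy * y / z + cy  # vertical pixel coordinate
        if u_min <= u <= u_max and v_min <= v <= v_max:
            roi.append((x, y, z))
    return roi


if __name__ == "__main__":
    points = [(0.0, 0.0, 10.0), (5.0, 0.0, 10.0), (0.0, 0.0, -5.0), (0.5, -0.5, 10.0)]
    print(select_roi_points(points, 100.0, 100.0, 50.0, 50.0, (40.0, 40.0, 60.0, 60.0)))
```

A selection of this kind keeps every point inside the viewing frustum of the box, so it can include background points that happen to project into the same region.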
