Paper Title

CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection

Authors

Su Pang, Daniel Morris, Hayder Radha

Abstract

There have been significant advances in neural networks for both 3D object detection using LiDAR and 2D object detection using video. However, it has been surprisingly difficult to train networks to effectively use both modalities in a way that demonstrates gain over single-modality networks. In this paper, we propose a novel Camera-LiDAR Object Candidates (CLOCs) fusion network. CLOCs fusion provides a low-complexity multi-modal fusion framework that significantly improves the performance of single-modality detectors. CLOCs operates on the combined output candidates before Non-Maximum Suppression (NMS) of any 2D and any 3D detector, and is trained to leverage their geometric and semantic consistencies to produce more accurate final 3D and 2D detection results. Our experimental evaluation on the challenging KITTI object detection benchmark, including 3D and bird's eye view metrics, shows significant improvements, especially at long distance, over the state-of-the-art fusion based methods. At time of submission, CLOCs ranks the highest among all the fusion-based methods in the official KITTI leaderboard. We will release our code upon acceptance.
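The abstract does not say how geometric and semantic consistency between candidates is measured. As a rough illustration only, the sketch below assumes one plausible geometric cue (the IoU between a 2D detection and the image-plane projection of a 3D candidate) and treats the two detector confidence scores as the semantic cue. The names `iou_2d` and `candidate_pairs` are hypothetical, and this is not the authors' implementation.

```python
def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def candidate_pairs(dets_2d, dets_3d_projected):
    """Pair every pre-NMS 2D candidate with every overlapping 3D candidate.

    Assumption: each detection is (box, score), and each 3D candidate has
    already been projected to a 2D box in the image plane.
    Returns tuples (index_2d, index_3d, iou, score_2d, score_3d).
    """
    pairs = []
    for i, (box_2d, score_2d) in enumerate(dets_2d):
        for j, (box_3d, score_3d) in enumerate(dets_3d_projected):
            iou = iou_2d(box_2d, box_3d)
            if iou > 0:  # keep only geometrically consistent pairs
                pairs.append((i, j, iou, score_2d, score_3d))
    return pairs


# Toy inputs: one 2D candidate, two projected 3D candidates.
dets_2d = [((0, 0, 10, 10), 0.9)]
dets_3d = [((5, 0, 15, 10), 0.4), ((20, 20, 30, 30), 0.8)]
pairs = candidate_pairs(dets_2d, dets_3d)
```

In this toy case only the first 3D candidate overlaps the 2D box, so a single pair survives. A fusion network along the lines described in the abstract could take such per-pair features as input when rescoring the 3D candidates before NMS.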
