3D for Free: Crossmodal Transfer Learning using HD Maps

Paper Title

3D for Free: Crossmodal Transfer Learning using HD Maps

Authors

Benjamin Wilson, Zsolt Kira, James Hays

Abstract

3D object detection is a core perceptual challenge for robotics and autonomous driving. However, the class taxonomies in modern autonomous driving datasets are significantly smaller than those of many influential 2D detection datasets. In this work, we address the long-tail problem by leveraging both the large class taxonomies of modern 2D datasets and the robustness of state-of-the-art 2D detection methods. We mine a large, unlabeled dataset of images and LiDAR and estimate 3D object bounding cuboids, seeded from an off-the-shelf 2D instance segmentation model. Critically, we constrain this ill-posed 2D-to-3D mapping by using high-definition maps and object size priors. The result of the mining process is a set of 3D cuboids with varying confidence. This mining process is itself a 3D object detector, although not an especially accurate one when evaluated as such. However, we then train a 3D object detection model on these cuboids and, consistent with other recent observations in the deep learning literature, find that the resulting model is fairly robust to the noisy supervision that our mining process provides. We mine a collection of 1151 unlabeled, multimodal driving logs from an autonomous vehicle and use the discovered objects to train a LiDAR-based object detector. We show that detector performance increases as we mine more unlabeled data. With our full, unlabeled dataset, our method performs competitively with fully supervised methods, even exceeding their performance for certain object categories, without any human 3D annotations.
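
The abstract does not say how a cuboid is actually fitted, so the following is only an illustrative sketch of how a map heading and a size prior could constrain the 2D-to-3D step. It is not the authors' procedure. It assumes the LiDAR points falling inside one 2D instance mask have already been selected, that the HD map supplies a lane heading at the object's location, and that a per-class length/width/height prior is available. The function name `cuboid_from_points` and all of its parameters are hypothetical.

```python
import numpy as np


def cuboid_from_points(points_xyz, lane_heading_rad, size_prior_lwh):
    """Fit a heading-aligned cuboid to the LiDAR points of one 2D instance.

    Assumptions (not from the paper): yaw is copied from the HD-map lane
    heading, and each dimension is the larger of the observed extent and
    the class size prior, since LiDAR usually sees only part of an object.
    """
    pts = np.asarray(points_xyz, dtype=float)
    c, s = np.cos(lane_heading_rad), np.sin(lane_heading_rad)

    # Rotation that maps world xy into a frame whose x-axis follows the lane.
    rot = np.array([[c, s], [-s, c]])
    xy_local = pts[:, :2] @ rot.T

    lo, hi = xy_local.min(axis=0), xy_local.max(axis=0)
    observed_length, observed_width = hi - lo
    z_min, z_max = pts[:, 2].min(), pts[:, 2].max()

    prior_l, prior_w, prior_h = size_prior_lwh
    length = max(observed_length, prior_l)
    width = max(observed_width, prior_w)
    height = max(z_max - z_min, prior_h)

    # Center on the observed footprint, then rotate back to world xy.
    center_xy = ((lo + hi) / 2.0) @ rot
    center = np.array([center_xy[0], center_xy[1], z_min + height / 2.0])

    return {
        "center": center,
        "dims_lwh": np.array([length, width, height]),
        "yaw": float(lane_heading_rad),
    }
```

Centering on the observed footprint is the crudest possible choice. A partially visible object would be better served by shifting the cuboid away from the sensor, which this sketch does not attempt.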
