LidarMultiNet: Unifying LiDAR Semantic Segmentation, 3D Object Detection, and Panoptic Segmentation in a Single Multi-task Network

Paper Title

LidarMultiNet: Unifying LiDAR Semantic Segmentation, 3D Object Detection, and Panoptic Segmentation in a Single Multi-task Network

Authors

Ye, Dongqiangzi; Chen, Weijia; Zhou, Zixiang; Xie, Yufei; Wang, Yu; Wang, Panqu; Foroosh, Hassan

Abstract

This technical report presents the 1st place winning solution for the Waymo Open Dataset 3D semantic segmentation challenge 2022. Our network, termed LidarMultiNet, unifies the major LiDAR perception tasks such as 3D semantic segmentation, object detection, and panoptic segmentation in a single framework. At the core of LidarMultiNet is a strong 3D voxel-based encoder-decoder network with a novel Global Context Pooling (GCP) module extracting global contextual features from a LiDAR frame to complement its local features. An optional second stage is proposed to refine the first-stage segmentation or generate accurate panoptic segmentation results. Our solution achieves a mIoU of 71.13 and is the best for most of the 22 classes on the Waymo 3D semantic segmentation test set, outperforming all the other 3D semantic segmentation methods on the official leaderboard. We demonstrate for the first time that major LiDAR perception tasks can be unified in a single strong network that can be trained end-to-end.
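
The abstract reports its headline result as a mIoU (mean intersection-over-union) of 71.13, presumably on a 0 to 100 scale. As a rough illustration of what that metric measures, here is a minimal sketch that computes mIoU from per-point class labels. The function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example; the official Waymo evaluation may handle ignored labels and averaging differently.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in pred or target (hypothetical helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # points where pred == target == c
    pred_count = [0] * num_classes     # points predicted as class c
    target_count = [0] * num_classes   # points labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: classes absent from both are skipped
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 6 points and 3 classes (made-up labels, not dataset values).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so their mean is 0.5. Multiplying by 100 puts the value on the scale the abstract appears to use.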
