Paper Title
Doubly Contrastive End-to-End Semantic Segmentation for Autonomous Driving under Adverse Weather
Paper Authors
Paper Abstract
Road scene understanding tasks have recently become crucial for self-driving vehicles. In particular, real-time semantic segmentation is indispensable for intelligent self-driving agents to recognize roadside objects in the driving area. As prior research works have primarily sought to improve segmentation performance with computationally heavy operations, they require significantly more hardware resources for both training and deployment, and are thus unsuitable for real-time applications. We therefore propose a doubly contrastive approach to improve the performance of a more practical, lightweight model for self-driving, specifically under adverse weather conditions such as fog, nighttime, rain, and snow. Our proposed approach exploits both image- and pixel-level contrasts in an end-to-end supervised learning scheme, without requiring a memory bank for global consistency or the pretraining step used in conventional contrastive methods. We validate the effectiveness of our method using SwiftNet on the ACDC dataset, where it achieves up to a 1.34%p improvement in mIoU (ResNet-18 backbone) at 66.7 FPS (2048x1024 resolution) on a single RTX 3080 Mobile GPU at inference. Furthermore, we demonstrate that replacing image-level supervision with self-supervision achieves comparable performance when pre-trained with clear-weather images.
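The abstract describes combining an image-level and a pixel-level contrastive loss, computed in-batch without a memory bank. The following is a minimal sketch of that idea, not the paper's implementation: a generic supervised InfoNCE loss applied once to per-image embeddings (grouped by an image-level label) and once to sampled per-pixel embeddings (grouped by semantic class). All names, dimensions, and the toy data are illustrative assumptions.

```python
import numpy as np

def supervised_info_nce(embeddings, labels, temperature=0.1):
    """In-batch supervised InfoNCE: for each anchor, positives are the
    other samples that share its label; no memory bank is kept."""
    # L2-normalize so similarities are cosine similarities
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)  # exclude self-similarity
    # log-softmax over each anchor's similarities to all other samples
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    has_pos = pos.sum(axis=1) > 0  # anchors with at least one positive
    pos_log_prob = np.where(pos, log_prob, 0.0)
    # average negative log-probability over each anchor's positives
    loss = -pos_log_prob.sum(axis=1)[has_pos] / pos.sum(axis=1)[has_pos]
    return loss.mean()

# Toy usage (illustrative shapes): image-level contrast over per-image
# embeddings, pixel-level contrast over a subset of per-pixel embeddings.
rng = np.random.default_rng(0)
img_emb = rng.normal(size=(8, 16))                # 8 images, 16-dim features
img_labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])   # e.g. weather condition
pix_emb = rng.normal(size=(32, 16))               # 32 sampled pixels
pix_labels = rng.integers(0, 4, size=32)          # semantic class per pixel

# "Doubly contrastive" objective: sum of the two levels
total = supervised_info_nce(img_emb, img_labels) + \
        supervised_info_nce(pix_emb, pix_labels)
print(float(total))
```

In practice both losses would be added to the usual cross-entropy segmentation loss and backpropagated jointly, which is what makes the scheme end-to-end; the sketch above only shows the contrastive terms themselves.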