Paper Title
Multi-modal Sensor Fusion-Based Deep Neural Network for End-to-end Autonomous Driving with Scene Understanding
Paper Authors
Paper Abstract
This study aims to improve the performance and generalization capability of end-to-end autonomous driving with scene understanding, leveraging deep learning and multimodal sensor fusion techniques. The designed end-to-end deep neural network takes the visual image and its associated depth information as inputs at an early fusion level and concurrently outputs pixel-wise semantic segmentation, as scene understanding, together with vehicle control commands. The end-to-end deep learning-based autonomous driving model is tested under high-fidelity simulated urban driving conditions and compared against the CoRL2017 and NoCrash benchmarks. The testing results show that the proposed approach achieves better performance and generalization ability, attaining a 100% success rate in static navigation tasks in both training and unobserved situations, as well as higher success rates than prior models in the other tasks. A further ablation study shows that the model without multimodal sensor fusion or the scene understanding subtask degrades in new environments because of erroneous perception. The results verify that the performance of our model is improved by the synergy of multimodal sensor fusion with the scene understanding subtask, demonstrating the feasibility and effectiveness of the developed deep neural network with multimodal sensor fusion.
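To make the described architecture concrete, the following is a minimal sketch in a PyTorch style: RGB and depth are concatenated at the input (early fusion), a shared convolutional encoder feeds both a pixel-wise segmentation head (scene understanding) and a control head (driving commands). The class name EarlyFusionDrivingNet, the layer sizes, the number of segmentation classes, and the control dimensions are illustrative assumptions for exposition only, not the paper's actual network.

# Illustrative sketch of an early-fusion, multi-task driving network (assumptions, not the paper's model).
import torch
import torch.nn as nn

class EarlyFusionDrivingNet(nn.Module):
    def __init__(self, num_seg_classes: int = 13, num_controls: int = 3):
        super().__init__()
        # Shared encoder over the early-fused 4-channel (RGB + depth) input.
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Scene-understanding head: upsamples back to pixel-wise class scores.
        self.seg_head = nn.Sequential(
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, num_seg_classes, kernel_size=4, stride=2, padding=1),
        )
        # Control head: regresses vehicle commands (e.g., steer, throttle, brake).
        self.control_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 64), nn.ReLU(inplace=True),
            nn.Linear(64, num_controls),
        )

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
        x = torch.cat([rgb, depth], dim=1)        # early sensor fusion at the input level
        features = self.encoder(x)                # shared representation
        segmentation = self.seg_head(features)    # scene understanding output
        controls = self.control_head(features)    # vehicle control output
        return segmentation, controls

if __name__ == "__main__":
    model = EarlyFusionDrivingNet()
    rgb = torch.randn(1, 3, 128, 128)     # camera image
    depth = torch.randn(1, 1, 128, 128)   # aligned depth map
    seg, ctrl = model(rgb, depth)
    print(seg.shape, ctrl.shape)          # torch.Size([1, 13, 128, 128]) torch.Size([1, 3])

In such a design, the segmentation head is typically supervised with a cross-entropy loss and the control head with a regression loss, and the two losses are summed during training so the scene-understanding subtask regularizes the shared encoder; the exact losses and weighting here are assumptions, not values taken from the paper.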