自动驾驶的环绕视觉摄像头的整体视觉感知

论文标题

自动驾驶的环绕视觉摄像头的整体视觉感知

Surround-View Cameras based Holistic Visual Perception for Automated Driving

论文作者

Kumar, Varun Ravi

论文摘要

眼睛的形成导致进化的大爆炸。动态从原始生物体等待，等待食物接触以食用视觉传感器所寻求的食物。人眼是进化最复杂的发展之一，但仍然存在缺陷。人类已经发展了一种生物学感知算法，能够驾驶汽车，操作机械，飞行飞机和驾驶数百万年的船只。为计算机自动化这些功能对于包括自动驾驶汽车，增强现实和建筑测量的各种应用至关重要。在自动驾驶汽车的背景下，近场视觉感知可以使环境在$ 0-10 $米和360°覆盖范围内感知到车辆周围的环境。这是开发更安全的自动驾驶的关键决策组成部分。计算机视觉和深度学习的最新进展以及相机和激光镜等高质量传感器的结合增强了成熟的视觉感知解决方案。到目前为止，远场感知一直是主要重点。另一个重要的问题是可用于开发实时应用程序的有限处理能力。由于这种瓶颈，性能和运行时间效率之间经常会取消权衡。我们专注于以下问题以解决这些问题：1）使用卷积神经网络制定各种视觉感知任务（例如几何和语义任务），开发具有高性能和低计算复杂性的近场感知算法。 2）使用多任务学习来克服计算瓶颈，通过在任务之间共享初始卷积层并制定平衡任务的优化策略。

The formation of eyes led to the big bang of evolution. The dynamics changed from a primitive organism waiting for the food to come into contact for eating food being sought after by visual sensors. The human eye is one of the most sophisticated developments of evolution, but it still has defects. Humans have evolved a biological perception algorithm capable of driving cars, operating machinery, piloting aircraft, and navigating ships over millions of years. Automating these capabilities for computers is critical for various applications, including self-driving cars, augmented reality, and architectural surveying. Near-field visual perception in the context of self-driving cars can perceive the environment in a range of $0-10$ meters and 360° coverage around the vehicle. It is a critical decision-making component in the development of safer automated driving. Recent advances in computer vision and deep learning, in conjunction with high-quality sensors such as cameras and LiDARs, have fueled mature visual perception solutions. Until now, far-field perception has been the primary focus. Another significant issue is the limited processing power available for developing real-time applications. Because of this bottleneck, there is frequently a trade-off between performance and run-time efficiency. We concentrate on the following issues in order to address them: 1) Developing near-field perception algorithms with high performance and low computational complexity for various visual perception tasks such as geometric and semantic tasks using convolutional neural networks. 2) Using Multi-Task Learning to overcome computational bottlenecks by sharing initial convolutional layers between tasks and developing optimization strategies that balance tasks.

下载PDF全文

下载文献需遵守相关版权规定

论文标题