Paper Title
Multi-Stage CNN-Based Monocular 3D Vehicle Localization and Orientation Estimation
Authors
Abstract
This paper aims to design a 3D object detection model for 2D images taken by a monocular camera, combining an estimated bird's-eye-view elevation map with a deep representation of object features. The proposed model uses a pre-trained ResNet-50 network as its backbone, with three additional branches. The model first builds a bird's-eye-view elevation map to estimate the depth of the objects in the scene, then uses that depth to estimate each object's 3D bounding box. We trained and evaluated it on two major datasets: a synthetic dataset and the KITTI dataset.