End-to-end Learning Improves Static Object Geo-localization in Monocular Video

Paper Title

End-to-end Learning Improves Static Object Geo-localization in Monocular Video

Authors

Mohamed Chaabane, Lionel Gueguen, Ameni Trabelsi, Ross Beveridge, Stephen O'Hara

Abstract

Accurately estimating the position of static objects, such as traffic lights, from the moving camera of a self-driving car is a challenging problem. In this work, we present a system that improves the localization of static objects by jointly optimizing its components via learning. Our system consists of networks that perform: 1) 5DoF object pose estimation from a single image, 2) association of objects between pairs of frames, and 3) multi-object tracking to produce the final geo-localization of the static objects within the scene. We evaluate our approach using a publicly available data set, focusing on traffic lights due to data availability. For each component, we compare against contemporary alternatives and show significantly improved performance. We also show that the end-to-end system performance is further improved via joint training of the constituent models.
