Improving Worst Case Visual Localization Coverage via Place-specific Sub-selection in Multi-camera Systems

Paper title

Improving Worst Case Visual Localization Coverage via Place-specific Sub-selection in Multi-camera Systems

Authors

Stephen Hausler, Ming Xu, Sourav Garg, Punarjay Chakravarty, Shubham Shrivastava, Ankit Vora, Michael Milford

Abstract

6-DoF visual localization systems use principled approaches rooted in 3D geometry to accurately estimate the camera pose of an image with respect to a map. Current techniques use hierarchical pipelines and learned 2D feature extractors to improve scalability and increase performance. However, despite gains in typical recall@0.25m type metrics, these systems still have limited utility for real-world applications like autonomous vehicles because of their "worst" areas of performance: the locations where they provide insufficient recall at a certain required error tolerance. Here we investigate the utility of using "place-specific configurations", where a map is segmented into a number of places, each with its own configuration for modulating the pose estimation step, in this case selecting a camera within a multi-camera system. On the Ford AV benchmark dataset, we demonstrate substantially improved worst-case localization performance compared to using off-the-shelf pipelines, minimizing the percentage of the dataset which has low recall at a certain error tolerance, as well as improved overall localization performance. Our proposed approach is particularly applicable to the crowdsharing model of autonomous vehicle deployment, where a fleet of AVs is regularly traversing a known route.
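
The idea of a per-place camera choice can be illustrated with a small sketch. This is not the authors' implementation. It assumes, hypothetically, that translation errors (in metres) from an earlier calibration traverse are available for each place and each camera, and it simply picks the camera with the highest recall at the chosen tolerance in each place. The names `recall_at_tolerance`, `select_camera_per_place`, `worst_case_recall`, the place and camera labels, and the error values are all made up for illustration.

```python
def recall_at_tolerance(errors_m, tol_m=0.25):
    """Fraction of queries whose translation error is within tol_m metres."""
    if not errors_m:
        return 0.0
    return sum(e <= tol_m for e in errors_m) / len(errors_m)


def select_camera_per_place(calib, tol_m=0.25):
    """Pick, for each place, the camera with the highest recall at tol_m.

    calib maps place_id -> {camera_name: [translation errors in metres]}.
    Ties are broken by alphabetical camera name (an arbitrary choice).
    """
    return {
        place: max(sorted(per_cam),
                   key=lambda cam: recall_at_tolerance(per_cam[cam], tol_m))
        for place, per_cam in calib.items()
    }


def worst_case_recall(calib, choice, tol_m=0.25):
    """Lowest per-place recall under a given place -> camera assignment."""
    return min(recall_at_tolerance(calib[p][choice[p]], tol_m) for p in calib)


# Hypothetical calibration errors, not taken from the paper.
calib = {
    "place_0": {"front": [0.1, 0.2, 0.9, 1.5], "rear": [0.1, 0.2, 0.2, 0.3]},
    "place_1": {"front": [0.05, 0.1, 0.2, 0.2], "rear": [0.4, 0.5, 0.1, 0.9]},
}

place_specific = select_camera_per_place(calib)
always_front = {place: "front" for place in calib}

print(place_specific)
print(worst_case_recall(calib, always_front))
print(worst_case_recall(calib, place_specific))
```

In this toy data, using one fixed camera everywhere leaves one place with poor recall, while the per-place choice raises the lowest per-place recall. Scoring the selection on the same data it was chosen from is optimistic, so a real evaluation would measure recall on a separate traverse.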
