Learning Sequential Descriptors for Sequence-based Visual Place Recognition

Paper Title

Learning Sequential Descriptors for Sequence-based Visual Place Recognition

Authors

Riccardo Mereu, Gabriele Trivigno, Gabriele Berton, Carlo Masone, Barbara Caputo

Abstract

In robotics, Visual Place Recognition is a continuous process that takes a video stream as input and produces a hypothesis of the robot's current position within a map of known places. This task requires robust, scalable, and efficient techniques for real applications. This work proposes a detailed taxonomy of techniques that use sequential descriptors, highlighting the different mechanisms used to fuse the information from the individual images. This categorization is supported by a complete benchmark of experimental results that provides evidence on the strengths and weaknesses of these architectural choices. In comparison to existing sequential descriptor methods, we further investigate the viability of Transformers instead of CNN backbones, and we propose a new ad-hoc sequence-level aggregator called SeqVLAD, which outperforms the prior state of the art on different datasets. The code is available at https://github.com/vandal-vpr/vg-transformers.
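
To make the idea of a sequential descriptor concrete, the following minimal sketch fuses per-frame descriptors into a single sequence descriptor and matches it against a map by cosine similarity. It is an illustration under stated assumptions, not the paper's method: the fusion here is plain mean pooling followed by L2 normalization (not SeqVLAD), the function names `l2_normalize`, `fuse_sequence`, and `retrieve` are hypothetical, and the three-dimensional toy vectors stand in for descriptors that a real CNN or Transformer backbone would produce.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse_sequence(frame_descriptors):
    """Fuse per-frame descriptors into one sequence descriptor.

    Assumption: simple mean pooling, chosen only for illustration.
    """
    dim = len(frame_descriptors[0])
    n = len(frame_descriptors)
    mean = [sum(d[i] for d in frame_descriptors) / n for i in range(dim)]
    return l2_normalize(mean)

def retrieve(query_seq, map_seqs):
    """Return (index, similarity) of the map sequence closest to the query."""
    q = fuse_sequence(query_seq)
    best_idx, best_sim = -1, -float("inf")
    for idx, seq in enumerate(map_seqs):
        m = fuse_sequence(seq)
        # Both vectors are unit length, so the dot product is the cosine similarity.
        sim = sum(a * b for a, b in zip(q, m))
        if sim > best_sim:
            best_idx, best_sim = idx, sim
    return best_idx, best_sim

# Toy map with two known places, each described by a two-frame sequence.
map_seqs = [
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]],
]
# Toy query sequence that resembles the second place.
query_seq = [[0.05, 1.0, 0.0], [0.0, 0.95, 0.1]]

best_idx, best_sim = retrieve(query_seq, map_seqs)
print(best_idx, round(best_sim, 3))
```

Mean pooling discards frame order. It is used here only because it is the simplest way to turn several frame descriptors into one vector that can be compared with a single dot product.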
