Paper Title

Deep Visual Odometry with Adaptive Memory

Authors

Fei Xue, Xin Wang, Junqiu Wang, Hongbin Zha

Abstract


We propose a novel deep visual odometry (VO) method that considers global information by selecting memory and refining poses. Existing learning-based methods take the VO task as a pure tracking problem, recovering camera poses from image snippets, which leads to severe error accumulation. Global information is crucial for alleviating accumulated errors, but it is challenging to effectively preserve such information in end-to-end systems. To deal with this challenge, we design an adaptive memory module, which progressively and adaptively saves information from local to global in a neural analogue of memory, enabling our system to process long-term dependencies. Benefiting from the global information in the memory, previous results are further refined by an additional refining module. With the guidance of previous outputs, we adopt spatial-temporal attention to select features for each view based on co-visibility in the feature domain. Specifically, our architecture, consisting of Tracking, Remembering, and Refining modules, works beyond tracking. Experiments on the KITTI and TUM-RGBD datasets demonstrate that our approach outperforms state-of-the-art methods by large margins and produces competitive results against classic approaches in regular scenes. Moreover, our model achieves outstanding performance in challenging scenarios such as texture-less regions and abrupt motions, where classic algorithms tend to fail.
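The core idea of the abstract — attending over a memory of stored global features to select context for the current view, with similarity acting as a proxy for co-visibility — can be sketched minimally as follows. Note that the function name `attend_memory`, the shapes, and the use of plain dot-product softmax attention are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def attend_memory(query, memory):
    """Hypothetical sketch: softmax-weighted sum of memory slots.

    query:  (d,)   feature of the current view
    memory: (n, d) features preserved from previous views (the "memory")
    Returns a (d,) context vector emphasizing co-visible memory slots.
    """
    # Scaled dot-product similarity between the query and each memory slot
    scores = memory @ query / np.sqrt(query.shape[0])      # (n,)
    # Numerically stable softmax over the n memory slots
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Weighted combination of memory features
    return weights @ memory                                # (d,)

rng = np.random.default_rng(0)
memory = rng.standard_normal((8, 16))   # 8 stored views, 16-dim features
query = rng.standard_normal(16)         # current-view feature
context = attend_memory(query, memory)
```

In the paper's framing, such a context vector would then guide the Refining module; the real system additionally applies the attention spatially and temporally over convolutional feature maps rather than flat vectors.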
