Leveraging Local and Global Descriptors in Parallel to Search Correspondences for Visual Localization

Paper Title

Leveraging Local and Global Descriptors in Parallel to Search Correspondences for Visual Localization

Authors

Pengju Zhang, Yihong Wu, Bingxi Liu

Abstract

Visual localization, which computes the 6DoF camera pose from a given image, has wide applications in robotics, virtual reality, augmented reality, and other fields. Two kinds of descriptors are important for visual localization. Global descriptors extract a single feature from the whole image. Local descriptors extract a feature from an image patch, usually one enclosing a keypoint. A growing number of visual localization methods have two stages: first, image retrieval by global descriptors; then, 2D-3D point correspondence search by local descriptors within the retrieval results. In most methods the two stages run serially. This simple combination does not realize the full benefit of fusing local and global descriptors, because the 3D points obtained from the retrieval results serve as nearest neighbor candidates of the 2D image points on the basis of global descriptors alone. Each 2D image point is also called a query local feature when establishing 2D-3D point correspondences. In this paper, we propose a novel parallel search framework that leverages the advantages of both local and global descriptors to obtain nearest neighbor candidates of a query local feature. Specifically, besides using deep-learning-based global descriptors, we also use local descriptors to construct random tree structures for obtaining nearest neighbor candidates of the query local feature. We propose a new probabilistic model and a new deep-learning-based local descriptor for constructing the random trees. The loss function for the proposed local descriptor includes a weighted Hamming regularization term that preserves discriminativeness after binarization. The loss function co-trains both real-valued and binary descriptors, and the results are integrated into the random trees.
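
As a rough illustration of the parallel idea described in the abstract, the following Python sketch gathers nearest neighbor candidates for one query local feature from two branches at once and takes their union. It is not the authors' implementation. Brute-force Hamming search stands in for the random trees, plain sign binarization stands in for the learned binary descriptor, and the probabilistic model is omitted. All names (`parallel_candidates`, `db_image_id`, `top_images`, `top_local`) and the toy data are hypothetical.

```python
import numpy as np


def binarize(desc):
    """Sign-binarize real-valued descriptors into 0/1 bits (assumed scheme)."""
    return (np.asarray(desc) > 0).astype(np.uint8)


def hamming(a, b):
    """Hamming distance between one binary vector `a` and each row of `b`."""
    return np.count_nonzero(a != b, axis=1)


def parallel_candidates(q_global, q_local, db_global, db_local, db_image_id,
                        top_images=1, top_local=1):
    """Return indices of database local features that are candidates
    for the query local feature, gathered from both branches."""
    # Branch 1: image retrieval by cosine similarity of global descriptors.
    sims = db_global @ q_global / (
        np.linalg.norm(db_global, axis=1) * np.linalg.norm(q_global))
    retrieved = np.argsort(-sims)[:top_images]
    from_global = set(np.flatnonzero(np.isin(db_image_id, retrieved)).tolist())

    # Branch 2: nearest neighbors by Hamming distance on binarized local
    # descriptors (brute force here, in place of the paper's random trees).
    dists = hamming(binarize(q_local), binarize(db_local))
    from_local = set(np.argsort(dists, kind="stable")[:top_local].tolist())

    # Parallel framework: candidates come from the union of both branches.
    return from_global | from_local


# Toy database: 2 images, 4 local features (features 0-1 in image 0, 2-3 in image 1).
db_global = np.array([[1.0, 0.0], [0.0, 1.0]])
db_image_id = np.array([0, 0, 1, 1])
db_local = np.array([[1, -1, 1, -1],
                     [-1, -1, 1, 1],
                     [1, 1, 1, 1],
                     [1, 1, 1, -1]], dtype=float)
q_global = np.array([1.0, 0.1])
q_local = np.array([1.0, 1.0, 1.0, 1.0])

candidates = parallel_candidates(q_global, q_local, db_global, db_local, db_image_id)
print(sorted(candidates))  # [0, 1, 2]
```

In this toy case a serial pipeline would only consider features 0 and 1, which belong to the retrieved image, while the local branch also contributes feature 2, the closest match in Hamming distance.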
