A Review on Object Pose Recovery: from 3D Bounding Box Detectors to Full 6D Pose Estimators

Paper Title

A Review on Object Pose Recovery: from 3D Bounding Box Detectors to Full 6D Pose Estimators

Authors

Caner Sahin, Guillermo Garcia-Hernando, Juil Sock, Tae-Kyun Kim

Abstract

Object pose recovery has gained increasing attention in the computer vision field as it has become an important problem in rapidly evolving technological areas related to autonomous driving, robotics, and augmented reality. Existing review-related studies have addressed the problem at the visual level in 2D, going through the methods which produce 2D bounding boxes of objects of interest in RGB images. The 2D search space is enlarged either by using the geometry information available in 3D space along with RGB (Mono/Stereo) images, or by utilizing depth data from LIDAR sensors and/or RGB-D cameras. 3D bounding box detectors, producing category-level amodal 3D bounding boxes, are evaluated on gravity-aligned images, while full 6D object pose estimators are mostly tested at instance level on images where the alignment constraint is removed. Recently, 6D object pose estimation has been tackled at the level of categories. In this paper, we present the first comprehensive and most recent review of the methods on object pose recovery, from 3D bounding box detectors to full 6D pose estimators. The methods mathematically model the problem as a classification, regression, classification & regression, template matching, or point-pair feature matching task. Based on this, a mathematical-model-based categorization of the methods is established. Datasets used for evaluating the methods are investigated with respect to the challenges, and evaluation metrics are studied. Quantitative results of experiments in the literature are analyzed to show which category of methods performs best across which types of challenges. The analyses are further extended by comparing two methods, which are our own implementations, so that the outcomes from the public results are further solidified. The current position of the field regarding object pose recovery is summarized, and possible research directions are identified.
