Paper Title
3D Point-to-Keypoint Voting Network for 6D Pose Estimation
Paper Authors
Paper Abstract
Object 6D pose estimation is an important research topic in computer vision because of its wide range of applications and the challenges posed by real-world complexity and variation. We argue that fully exploiting the spatial relationships between points helps improve pose estimation performance, especially in scenes with background clutter and partial occlusion, yet this information has usually been ignored in previous work based on RGB images or RGB-D data. In this paper, we propose a framework for 6D pose estimation from RGB-D data based on the spatial structure characteristics of 3D keypoints. We adopt point-wise dense feature embedding to vote for 3D keypoints, which makes full use of the structural information of the rigid body. After a CNN predicts the direction vectors pointing to the keypoints, we use RANSAC voting to compute the coordinates of the 3D keypoints, and the pose transformation can then be obtained by the least-squares method. In addition, a spatial dimension sampling strategy for points is employed, which allows the method to achieve excellent performance on small training sets. The proposed method is evaluated on two benchmark datasets, LINEMOD and OCCLUSION LINEMOD. Experimental results show that our method outperforms state-of-the-art approaches in real time, achieving an ADD(-S) accuracy of 98.7\% on the LINEMOD dataset and 52.6\% on the OCCLUSION LINEMOD dataset.
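The abstract describes two geometric steps that follow the network's per-point predictions: recovering each 3D keypoint by RANSAC voting over the predicted direction vectors, and fitting the 6D pose to the recovered keypoints by least squares. The sketch below is only an illustration of those two steps, not the authors' code; the function names, the two-ray hypothesis scheme, and the inlier threshold are assumptions.

```python
import numpy as np

def closest_point_between_rays(p1, d1, p2, d2):
    # Midpoint of the shortest segment between the lines p1 + s*d1 and p2 + t*d2.
    w0 = p1 - p2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b
    if abs(denom) < 1e-9:                 # nearly parallel directions
        return 0.5 * (p1 + p2)
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    return 0.5 * ((p1 + s * d1) + (p2 + t * d2))

def vote_keypoint(points, dirs, n_hyp=128, inlier_thresh=0.01, rng=None):
    """RANSAC-style voting: points (N,3) are scene points, dirs (N,3) are unit
    vectors predicted to point from each scene point toward one 3D keypoint."""
    rng = rng or np.random.default_rng(0)
    best_kp, best_count = None, -1
    for _ in range(n_hyp):
        i, j = rng.choice(len(points), size=2, replace=False)
        kp = closest_point_between_rays(points[i], dirs[i], points[j], dirs[j])
        # A point votes for the hypothesis if its predicted ray passes close to it.
        diff = kp - points
        perp = diff - (diff * dirs).sum(axis=1, keepdims=True) * dirs
        count = int((np.linalg.norm(perp, axis=1) < inlier_thresh).sum())
        if count > best_count:
            best_kp, best_count = kp, count
    return best_kp

def fit_rigid_transform(model_kps, scene_kps):
    """Least-squares (Kabsch/SVD) rotation R and translation t such that
    scene_kps ~ (R @ model_kps.T).T + t, both arrays of shape (K,3)."""
    mu_m, mu_s = model_kps.mean(axis=0), scene_kps.mean(axis=0)
    H = (model_kps - mu_m).T @ (scene_kps - mu_s)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:              # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_s - R @ mu_m
    return R, t
```

In this reading, each keypoint hypothesis comes from intersecting two randomly chosen rays, inliers are counted by perpendicular distance to the hypothesis, and the final pose is the closed-form least-squares alignment between the object's model keypoints and the voted scene keypoints.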