Paper Title
Two-hand Global 3D Pose Estimation Using Monocular RGB
Paper Authors
Paper Abstract
We tackle the challenging task of estimating global 3D joint locations for both hands from only monocular RGB input images. We propose a novel multi-stage convolutional neural network pipeline that accurately segments and locates the hands despite inter-hand occlusion and complex background noise, and estimates the 2D and 3D canonical joint locations without any depth information. Global joint locations with respect to the camera origin are computed from the estimated hand poses and the actual length of the key bone using a novel projection algorithm. To train the CNNs for this new task, we introduce a large-scale synthetic 3D hand pose dataset. We demonstrate that our system outperforms previous works on 3D canonical hand pose estimation benchmark datasets using RGB-only information. Additionally, we present the first work that achieves accurate global 3D hand tracking on both hands using RGB-only inputs, and we provide extensive quantitative and qualitative evaluations.
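
To make the global-position step concrete, below is a minimal sketch of one way to lift a canonical hand pose to camera coordinates from a known key-bone length, assuming a pinhole camera and a weak-perspective approximation. This is an illustration only, not the paper's actual projection algorithm; all names (recover_global_joints, KEY_A, KEY_B, fx, fy, cx, cy) are hypothetical.

```python
import numpy as np

# Hypothetical indices of the key bone's two endpoint joints in a 21-joint hand.
KEY_A, KEY_B = 0, 9

def recover_global_joints(joints_2d, joints_canon, key_bone_len_m, fx, fy, cx, cy):
    """joints_2d: (21, 2) pixel coordinates; joints_canon: (21, 3) canonical
    (root-relative, scale-normalized) pose; returns (21, 3) joints in meters
    relative to the camera origin."""
    # Rescale the canonical pose to metric units using the known bone length.
    canon_scale = np.linalg.norm(joints_canon[KEY_A] - joints_canon[KEY_B])
    joints_metric = joints_canon * (key_bone_len_m / canon_scale)

    # Weak-perspective depth: a roughly fronto-parallel bone of length L meters
    # at depth z projects to about f * L / z pixels.
    bone_px = np.linalg.norm(joints_2d[KEY_A] - joints_2d[KEY_B])
    z_root = fx * key_bone_len_m / bone_px

    # Back-project the 2D root through the pinhole model to its 3D position.
    u, v = joints_2d[KEY_A]
    root = np.array([(u - cx) * z_root / fx, (v - cy) * z_root / fy, z_root])

    # Place the metric, root-relative canonical pose at the recovered root.
    return joints_metric - joints_metric[KEY_A] + root
```

The weak-perspective assumption breaks down when the key bone points strongly toward or away from the camera, which is why a dedicated projection algorithm, as the abstract describes, is needed for robust global localization.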