Paper Title
Predicting Camera Viewpoint Improves Cross-dataset Generalization for 3D Human Pose Estimation
Authors
Abstract
Monocular estimation of 3D human pose has attracted increased attention with the availability of large ground-truth motion capture datasets. However, the diversity of available training data is limited, and it is not clear to what extent methods generalize outside the specific datasets they are trained on. In this work, we carry out a systematic study of the diversity and biases present in specific datasets and their effect on cross-dataset generalization across a compendium of 5 pose datasets. We specifically focus on systematic differences in the distribution of camera viewpoints relative to a body-centered coordinate frame. Based on this observation, we propose an auxiliary task of predicting the camera viewpoint in addition to pose. We find that models trained to jointly predict viewpoint and pose systematically show significantly improved cross-dataset generalization.
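To make "camera viewpoint relative to a body-centered coordinate frame" concrete, here is a minimal NumPy sketch. It is not taken from the paper. It assumes the body frame is built from three joints (left hip, right hip, neck) given in camera coordinates, and that the viewpoint is summarized as an azimuth and elevation angle. The function names, the choice of joints, and the axis conventions are all illustrative assumptions.

```python
import numpy as np


def body_frame(left_hip, right_hip, neck):
    """Build a hypothetical body-centered frame from three 3D joints.

    Returns (origin, R), where the rows of R are the body axes
    expressed in camera coordinates. Axis choice is an assumption.
    """
    left_hip = np.asarray(left_hip, dtype=float)
    right_hip = np.asarray(right_hip, dtype=float)
    neck = np.asarray(neck, dtype=float)

    origin = 0.5 * (left_hip + right_hip)   # pelvis midpoint
    x = right_hip - left_hip                # lateral axis (left to right hip)
    x = x / np.linalg.norm(x)
    up = neck - origin                      # rough torso direction
    z = np.cross(x, up)                     # axis normal to the torso plane
    z = z / np.linalg.norm(z)
    y = np.cross(z, x)                      # torso axis, orthogonal to x and z
    return origin, np.stack([x, y, z])


def camera_viewpoint(left_hip, right_hip, neck):
    """Return (azimuth, elevation) in radians of the camera as seen from the body.

    The camera sits at the origin of camera coordinates, so its position
    in the body frame is R @ (0 - origin).
    """
    origin, R = body_frame(left_hip, right_hip, neck)
    cam_in_body = R @ (-origin)
    d = cam_in_body / np.linalg.norm(cam_in_body)
    azimuth = np.arctan2(d[0], d[2])                 # angle around the torso axis
    elevation = np.arcsin(np.clip(d[1], -1.0, 1.0))  # angle above the hip plane
    return azimuth, elevation
```

Under these assumptions, computing the two angles for every training sample and comparing their histograms across datasets is one way to look at the viewpoint bias the abstract refers to. The same angles could also serve as regression or classification targets for an auxiliary viewpoint head.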