Paper Title

Cascaded deep monocular 3D human pose estimation with evolutionary training data

Paper Authors

Li, Shichao; Ke, Lei; Pratama, Kevin; Tai, Yu-Wing; Tang, Chi-Keung; Cheng, Kwang-Ting

Paper Abstract

End-to-end deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation, yet these models may fail for unseen poses when trained on limited and fixed data. This paper proposes a novel data augmentation method that: (1) is scalable for synthesizing a massive amount of training data (over 8 million valid 3D human poses with corresponding 2D projections) for training 2D-to-3D networks, and (2) can effectively reduce dataset bias. Our method evolves a limited dataset to synthesize unseen 3D human skeletons based on a hierarchical human representation and heuristics inspired by prior knowledge. Extensive experiments show that our approach not only achieves state-of-the-art accuracy on the largest public benchmark, but also generalizes significantly better to unseen and rare poses. Code, pre-trained models and tools are available at this HTTPS URL.
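The abstract outlines an evolutionary data-augmentation loop: existing 3D skeletons are recombined and perturbed under anatomy-inspired constraints so that only plausible new poses are added to the training set. The Python sketch below only illustrates that general idea; the joint indices, part groupings, mutation scale, and bone-length validity check are assumptions made for this example, not details taken from the paper or its released code.

```python
# A minimal sketch (not the authors' released implementation) of evolutionary
# pose-data augmentation: parent 3D skeletons exchange body parts (crossover),
# receive small perturbations (mutation), and only plausible children survive.
# All joint indices, limits, and the validity rule are illustrative assumptions.
import numpy as np

NUM_JOINTS = 17                      # hypothetical 17-joint skeleton, shape (17, 3)
LEFT_ARM = [11, 12, 13]              # assumed indices of the left-arm joints
RIGHT_LEG = [4, 5, 6]                # assumed indices of the right-leg joints
PART_GROUPS = [LEFT_ARM, RIGHT_LEG]  # body parts that crossover may swap

def crossover(parent_a: np.ndarray, parent_b: np.ndarray) -> np.ndarray:
    """Copy parent_a, then graft one randomly chosen body part from parent_b."""
    child = parent_a.copy()
    part = PART_GROUPS[np.random.randint(len(PART_GROUPS))]
    child[part] = parent_b[part]
    return child

def mutate(pose: np.ndarray, sigma: float = 0.02) -> np.ndarray:
    """Add small Gaussian jitter to the joint positions (assumed metric scale)."""
    return pose + np.random.normal(scale=sigma, size=pose.shape)

def is_valid(pose: np.ndarray, min_len: float = 0.01, max_len: float = 1.0) -> bool:
    """Toy plausibility check: consecutive-joint distances must stay in range.
    The paper uses anatomy-inspired constraints; this merely stands in for them."""
    lengths = np.linalg.norm(np.diff(pose, axis=0), axis=1)
    return bool(np.all((lengths > min_len) & (lengths < max_len)))

def evolve(population: list, generations: int = 3) -> list:
    """Grow the dataset: each generation appends valid crossover+mutation children."""
    data = list(population)
    for _ in range(generations):
        children = []
        for _ in range(len(data)):
            a, b = np.random.randint(len(data), size=2)
            child = mutate(crossover(data[a], data[b]))
            if is_valid(child):
                children.append(child)
        data.extend(children)
    return data

if __name__ == "__main__":
    # Seed with a few synthetic "poses" just to exercise the loop.
    seed_poses = [np.random.rand(NUM_JOINTS, 3) * 0.5 for _ in range(8)]
    augmented = evolve(seed_poses)
    print(f"{len(seed_poses)} seed poses evolved into {len(augmented)} poses")
```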
