Paper Title
Appearance Consensus Driven Self-Supervised Human Mesh Recovery
Paper Authors
Paper Abstract
We present a self-supervised human mesh recovery framework to infer human pose and shape from monocular images in the absence of any paired supervision. Recent advances have shifted interest towards directly regressing the parameters of a parametric human model by supervising them on large-scale datasets with 2D landmark annotations. This limits the generalizability of such approaches to images from unlabeled, in-the-wild environments. Acknowledging this, we propose a novel appearance-consensus-driven self-supervised objective. To effectively disentangle the foreground (FG) human, we rely on image pairs depicting the same person (consistent FG) in varied pose and background (BG), obtained from unlabeled wild videos. The proposed FG appearance consistency objective makes use of a novel, differentiable Color-recovery module that obtains vertex colors without any appearance network, via an efficient realization of color-picking and reflectional symmetry. We achieve state-of-the-art results on standard model-based 3D pose estimation benchmarks at comparable supervision levels. Furthermore, the resulting colored mesh prediction opens up the use of our framework for a variety of appearance-related tasks beyond pose and shape estimation, establishing its superior generalizability.
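To make the abstract's central idea concrete, below is a minimal PyTorch-style sketch of an appearance-consensus objective: per-vertex colors are "picked" from two images of the same person by differentiable bilinear sampling at projected vertex locations, optionally symmetrized across the body's left/right mirror plane, and penalized for disagreement on mutually visible vertices. The tensor shapes, the symmetry index map `sym_idx`, and the visibility masks are illustrative assumptions; this is not the authors' released implementation.

```python
# Sketch of appearance-consensus self-supervision, assuming:
#   image:    (B, 3, H, W) RGB in [0, 1]
#   verts_2d: (B, V, 2) projected mesh vertices, normalized to [-1, 1]
#   vis_*:    (B, V) soft visibility in [0, 1] (hypothetical input)
#   sym_idx:  (V,) index map pairing each vertex with its mirror vertex
#             on a left/right-symmetric template mesh (assumption)
import torch
import torch.nn.functional as F


def pick_vertex_colors(image, verts_2d):
    """Differentiable color-picking: bilinearly sample a color for
    each projected vertex location."""
    grid = verts_2d.unsqueeze(2)                       # (B, V, 1, 2)
    colors = F.grid_sample(image, grid, mode="bilinear",
                           align_corners=True)         # (B, 3, V, 1)
    return colors.squeeze(-1).permute(0, 2, 1)         # (B, V, 3)


def symmetrize(colors, sym_idx):
    """Reflectional symmetry prior: average each vertex's color with
    that of its mirror vertex."""
    return 0.5 * (colors + colors[:, sym_idx])


def appearance_consensus_loss(img_a, img_b, verts_a, verts_b,
                              vis_a, vis_b, sym_idx):
    """Colors picked for the same person from two frames with varied
    pose and background should agree on mutually visible vertices."""
    col_a = symmetrize(pick_vertex_colors(img_a, verts_a), sym_idx)
    col_b = symmetrize(pick_vertex_colors(img_b, verts_b), sym_idx)
    both_vis = (vis_a * vis_b).unsqueeze(-1)           # (B, V, 1)
    diff = both_vis * (col_a - col_b).abs()
    # Normalize by visible vertex count (x3 color channels).
    return diff.sum() / (3.0 * both_vis.sum().clamp(min=1.0))
```

In a training loop, such a loss would be added alongside the usual self-supervised terms, and because both the projection and the color sampling are differentiable, gradients flow back to the pose and shape regressor without requiring any paired annotations or a separate appearance network.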