Paper Title

Articulation-aware Canonical Surface Mapping

Authors

Nilesh Kulkarni, Abhinav Gupta, David F. Fouhey, Shubham Tulsiani

Abstract

We tackle the tasks of: 1) predicting a Canonical Surface Mapping (CSM) that indicates the mapping from 2D pixels to corresponding points on a canonical template shape, and 2) inferring the articulation and pose of the template corresponding to the input image. While previous approaches rely on keypoint supervision for learning, we present an approach that can learn without such annotations. Our key insight is that these tasks are geometrically related, and we can obtain supervisory signal via enforcing consistency among the predictions. We present results across a diverse set of animal object categories, showing that our method can learn articulation and CSM prediction from image collections using only foreground mask labels for training. We empirically show that allowing articulation helps learn more accurate CSM prediction, and that enforcing the consistency with predicted CSM is similarly critical for learning meaningful articulation.
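The supervisory signal described above comes from a cycle-consistency idea: for each foreground pixel, the predicted canonical surface point, when transformed by the predicted (articulated) pose and projected back into the image, should land on that same pixel. A minimal NumPy sketch of such a reprojection-consistency loss is below; the function names, the simple pinhole camera, and the rigid (non-articulated) transform are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def project(points_3d, rotation, translation, focal=1.0):
    """Pinhole projection: map Nx3 canonical-frame points to Nx2 pixels
    after applying a rigid pose (illustrative stand-in for the paper's
    articulated template transform)."""
    cam = points_3d @ rotation.T + translation   # canonical -> camera frame
    return focal * cam[:, :2] / cam[:, 2:3]     # perspective divide

def csm_consistency_loss(pixels, csm_points, rotation, translation):
    """Geometric-consistency loss: each pixel's predicted canonical
    surface point, posed and reprojected, should return to that pixel.
    `pixels` is Nx2, `csm_points` is the Nx3 CSM prediction per pixel."""
    reproj = project(csm_points, rotation, translation)
    return np.mean(np.sum((reproj - pixels) ** 2, axis=1))
```

With a perfect CSM and pose the loss is zero; gradients of this residual are what let both the CSM and the articulation be learned jointly from foreground masks, without keypoint annotations.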
