Paper Title

Learning Inverse Rendering of Faces from Real-world Videos

Authors

Yuda Qiu, Zhangyang Xiong, Kai Han, Zhongyuan Wang, Zixiang Xiong, Xiaoguang Han

Abstract

In this paper we examine the problem of inverse rendering of real face images. Existing methods decompose a face image into three components (albedo, normal, and illumination) by supervised training on synthetic face data. However, due to the domain gap between real and synthetic face images, a model trained on synthetic data often does not generalize well to real data. Meanwhile, since no ground truth for any component is available for real images, it is not feasible to conduct supervised learning on real face images. To alleviate this problem, we propose a weakly supervised training approach that trains our model on real face videos, based on the assumption that albedo and normal are consistent across different frames, thus bridging the gap between real and synthetic face images. In addition, we introduce a learning framework, called IlluRes-SfSNet, that further extracts a residual map to capture the global illumination effects that carry the fine details largely ignored by existing methods. Our network is trained on both real and synthetic data, benefiting from both. We comprehensively evaluate our method on various benchmarks, obtaining better inverse rendering results than the state-of-the-art.
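The abstract's two key ideas can be sketched in a few lines: an image composed as shading times albedo plus a residual map for global illumination effects, and a weak-supervision loss that ties the albedo and normal predictions of two frames from the same face video together. The following is a minimal numpy sketch; the Lambertian shading stand-in, the function names, the L1 distance, and the loss weights are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def render(albedo, normal, light, residual):
    """Compose an image as shading * albedo + residual.

    albedo, normal, residual: H x W x 3 arrays; light: length-3 direction.
    The residual term absorbs global illumination effects (cast shadows,
    inter-reflections) that the simple shading model misses. A plain
    Lambertian dot product stands in for the network's learned shading."""
    shading = np.clip(normal @ light, 0.0, None)[..., None]  # H x W x 1
    return shading * albedo + residual

def consistency_loss(albedo_a, albedo_b, normal_a, normal_b,
                     w_albedo=1.0, w_normal=1.0):
    """Weak supervision on real video: albedo and normal maps predicted for
    two frames of the same face should agree. L1 distance and the weights
    here are illustrative choices."""
    l_alb = np.mean(np.abs(albedo_a - albedo_b))   # albedo consistency
    l_nrm = np.mean(np.abs(normal_a - normal_b))   # normal consistency
    return w_albedo * l_alb + w_normal * l_nrm
```

In this sketch the consistency term needs no ground-truth albedo or normal for real images, which is exactly what makes training on unlabeled real videos possible.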
