Paper Title
Leveraging Deepfakes to Close the Domain Gap between Real and Synthetic Images in Facial Capture Pipelines
Paper Authors
Paper Abstract
We propose an end-to-end pipeline for both building and tracking 3D facial models from personalized in-the-wild (cellphone, webcam, YouTube clips, etc.) video data. First, we present a method for automatic data curation and retrieval based on a hierarchical clustering framework typical of collision detection algorithms in traditional computer graphics pipelines. Subsequently, we utilize synthetic turntables and leverage deepfake technology in order to build a synthetic multi-view stereo pipeline for appearance capture that is robust to imperfect synthetic geometry and image misalignment. The resulting model is fit with an animation rig, which is then used to track facial performances. Notably, our novel use of deepfake technology enables us to perform robust tracking of in-the-wild data using differentiable renderers despite a significant synthetic-to-real domain gap. Finally, we outline how we train a motion capture regressor, leveraging the aforementioned techniques to avoid the need for real-world ground truth data and/or a high-end calibrated camera capture setup.
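The abstract's first step, hierarchical-clustering-based data curation, can be illustrated with a toy sketch. The paper does not specify its exact algorithm, so everything below (the function names, the centroid-based average linkage, the distance threshold) is an illustrative assumption: frames are represented as embedding vectors and greedily merged bottom-up, as a generic agglomerative clusterer would do.

```python
import math

def euclid(a, b):
    # Euclidean distance between two embedding vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(members, points):
    # Mean of the embeddings belonging to one cluster.
    dim = len(points[0])
    return [sum(points[m][k] for m in members) / len(members) for k in range(dim)]

def cluster_embeddings(points, threshold):
    """Toy agglomerative clustering (illustrative, not the paper's method).

    Start with one cluster per frame embedding, then repeatedly merge the
    closest pair of cluster centroids until no pair is within `threshold`.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        cents = [centroid(c, points) for c in clusters]
        d, i, j = min((euclid(cents[i], cents[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > threshold:
            break  # remaining clusters are well separated
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

On four toy 2-D embeddings, `cluster_embeddings([(0, 0), (0.1, 0), (5, 5), (5.1, 5)], 1.0)` groups the two nearby pairs into two clusters, mimicking how near-duplicate frames of the same subject would be binned together for curation.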