Paper Title

APB2FaceV2: Real-Time Audio-Guided Multi-Face Reenactment

Paper Authors

Jiangning Zhang, Xianfang Zeng, Chao Xu, Jun Chen, Yong Liu, Yunliang Jiang

Paper Abstract

Audio-guided face reenactment aims to generate a photorealistic face whose facial expression matches the input audio. However, current methods can only reenact one specific person once the model is trained, or require extra operations such as 3D rendering and image post-fusion to generate vivid faces. To address this challenge, we propose a novel \emph{R}eal-time \emph{A}udio-guided \emph{M}ulti-face reenactment approach named \emph{APB2FaceV2}, which can reenact different target faces among multiple persons, taking the corresponding reference face and driving audio signal as inputs. To enable end-to-end training and faster inference, we design a novel module named Adaptive Convolution (AdaConv) to inject audio information into the network, and adopt a lightweight backbone so that the network can run in real time on both CPU and GPU. Comparison experiments demonstrate the superiority of our approach over existing state-of-the-art methods, and further experiments show that our method is efficient and flexible for practical applications. Code is available at https://github.com/zhangzjn/APB2FaceV2
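To make the idea of an audio-conditioned adaptive convolution concrete, below is a minimal PyTorch-style sketch of one way audio features can modulate an image-feature convolution. The class name `AdaConvSketch`, the tensor shapes, and the scale/shift modulation scheme are assumptions for illustration, not the authors' actual AdaConv implementation.

```python
# Minimal sketch (assumption, not the paper's code): an MLP maps an audio
# feature vector to per-channel scale/shift parameters that modulate the
# output of an ordinary convolution, injecting audio into the image branch.
import torch
import torch.nn as nn


class AdaConvSketch(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, audio_dim: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        # Predict a per-channel scale and shift from the audio embedding.
        self.to_scale_shift = nn.Linear(audio_dim, 2 * out_ch)

    def forward(self, x: torch.Tensor, audio: torch.Tensor) -> torch.Tensor:
        # x: (B, in_ch, H, W) image features; audio: (B, audio_dim) audio features
        h = self.conv(x)
        scale, shift = self.to_scale_shift(audio).chunk(2, dim=1)
        scale = scale.unsqueeze(-1).unsqueeze(-1)  # (B, out_ch, 1, 1)
        shift = shift.unsqueeze(-1).unsqueeze(-1)
        return h * (1 + scale) + shift


if __name__ == "__main__":
    layer = AdaConvSketch(in_ch=64, out_ch=64, audio_dim=128)
    img_feat = torch.randn(2, 64, 32, 32)
    audio_feat = torch.randn(2, 128)
    print(layer(img_feat, audio_feat).shape)  # torch.Size([2, 64, 32, 32])
```

Because the audio only predicts lightweight per-channel parameters rather than full feature maps, a module of this kind keeps the backbone small, which is consistent with the paper's goal of real-time inference on CPU and GPU.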
