Title
Correspondence Learning for Controllable Person Image Generation
Authors
Abstract
We present a generative model for controllable person image synthesis, as shown in the figure, which can be applied to pose-guided person image synthesis, $i.e.$, converting the pose of a source person image to a target pose while preserving the texture of the source person image, and to clothing-guided person image synthesis, $i.e.$, changing the clothing texture of a source person image to a desired clothing texture. By explicitly establishing a dense correspondence between the target pose and the source image, we can effectively address the misalignment introduced by pose transfer and generate high-quality images. Specifically, we first generate the target semantic map under the guidance of the target pose, which provides a more accurate pose representation and structural constraints during the generation process. Then, a decomposed attribute encoder is used to extract component features, which not only helps to establish a more accurate dense correspondence but also enables clothing-guided person generation. After that, we establish a dense correspondence between the target pose and the source image within the decomposed feature domain. The source image features are warped according to the dense correspondence to flexibly account for deformations. Finally, the network renders the image based on the warped source image features and the target pose. Experimental results show that our method outperforms state-of-the-art methods in pose-guided person generation and demonstrates its effectiveness in clothing-guided person generation.
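The dense-correspondence warping step described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the function name, the softmax temperature `tau`, and the use of flattened `(positions, channels)` feature matrices are all assumptions; the paper's encoders and feature shapes are not specified in the abstract.

```python
import numpy as np

def warp_by_correspondence(f_src, f_tgt, tau=0.01):
    """Warp source features to target-pose positions via dense correspondence.

    f_src: (N, C) flattened source-image features (hypothetical encoder output).
    f_tgt: (M, C) flattened target-pose features.
    tau:   softmax temperature; smaller values give sharper correspondences.
    Returns the warped features (M, C) and the correspondence matrix (M, N).
    """
    # Cosine similarity between every target position and every source position.
    fs = f_src / np.linalg.norm(f_src, axis=1, keepdims=True)
    ft = f_tgt / np.linalg.norm(f_tgt, axis=1, keepdims=True)
    corr = ft @ fs.T                      # (M, N) correlation matrix

    # Row-wise softmax turns correlations into a soft correspondence map
    # (max-subtraction for numerical stability).
    logits = (corr - corr.max(axis=1, keepdims=True)) / tau
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)

    # Each target position gathers a weighted sum of source features,
    # i.e., the source features are "warped" onto the target pose layout.
    return attn @ f_src, attn
```

In a full pipeline the features would be 2D maps produced by learned encoders and the warped result would feed a rendering network; the soft-attention gather shown here is only the core warping mechanism.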