Paper Title

Generative Artisan: A Semantic-Aware and Controllable CLIPstyler

Authors

Zhenling Yang, Huacheng Song, Qiunan Wu

Abstract

Most current image style transfer methods require the user to supply an image in a particular style, from which the method extracts style features and textures to stylize a target image. This leaves two problems: the user may not have a reference style image, or it may be difficult to summarise the desired style with just one image. The recently proposed CLIPstyler addresses this by performing style transfer from a text description of the style alone. Although CLIPstyler performs well when landscapes or portraits appear alone, it can blur the people and lose the original semantics when people and landscapes coexist. To address these issues, we present a novel framework that uses a pre-trained CLIP text-image embedding model and guides image style transfer through an FCN semantic segmentation network. Specifically, we solve the portrait over-styling problem for both selfies and real-world landscape photos with human subjects, enhance the contrast between the style transfer effect on the portrait and on the landscape, and make the degree of style transfer in different semantic parts fully controllable. Our Generative Artisan resolves the failure cases of CLIPstyler, and we provide both qualitative and quantitative evaluations showing that it gives much better results than CLIPstyler on both selfies and real-world landscape photos with human subjects. This improvement makes it possible to commercialize our framework for business scenarios such as photo retouching software.
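The idea of controlling the degree of style transfer separately in each semantic part can be illustrated with a small sketch. This is not the authors' method: the abstract does not say how the segmentation mask enters the pipeline, so the sketch assumes a simple post-hoc blend of an already stylized image with the original, weighted by a person/background mask. The function name `blend_by_region`, its parameters, and the tiny grayscale images are all hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]
Mask = List[List[int]]     # 1 = person pixel, 0 = background pixel


def blend_by_region(content: Image, stylized: Image, person_mask: Mask,
                    person_strength: float,
                    background_strength: float) -> Image:
    """Mix stylized and original pixels with a separate strength per region.

    Assumption for illustration only: a strength of 0 keeps the original
    pixel and a strength of 1 takes the fully stylized pixel.
    """
    for strength in (person_strength, background_strength):
        if not 0.0 <= strength <= 1.0:
            raise ValueError("strengths must lie in [0, 1]")

    output: Image = []
    for content_row, stylized_row, mask_row in zip(content, stylized, person_mask):
        row = []
        for c, s, m in zip(content_row, stylized_row, mask_row):
            # Pick the strength that applies to this pixel's semantic region.
            alpha = person_strength if m == 1 else background_strength
            row.append((1.0 - alpha) * c + alpha * s)
        output.append(row)
    return output


# Hypothetical 1x2 image: left pixel is a person, right pixel is background.
result = blend_by_region(
    content=[[0.2, 0.8]],
    stylized=[[1.0, 0.0]],
    person_mask=[[1, 0]],
    person_strength=0.25,      # stylize the person only lightly
    background_strength=1.0,   # stylize the landscape fully
)
```

With a low `person_strength` and a high `background_strength`, the person region stays close to the original photo while the landscape takes on the full style, which is the kind of contrast the abstract describes.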
