Paper Title
StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation
Authors
Abstract
Generating images that fit a given text description using machine learning has improved greatly with the release of technologies such as the CLIP image-text encoder model; however, current methods lack artistic control over the style of the image to be generated. We present an approach for generating styled drawings for a given text description, in which a user can specify a desired drawing style using a sample image. Inspired by a theory in art that style and content are generally inseparable during the creative process, we propose a coupled approach, known here as StyleCLIPDraw, whereby the drawing is generated by optimizing for style and content simultaneously throughout the process, as opposed to applying style transfer after creating content in a sequence. Based on human evaluation, the styles of images generated by StyleCLIPDraw are strongly preferred to those produced by the sequential approach. Although the quality of content generation degrades for certain styles, overall, considering both content and style, StyleCLIPDraw is far more preferred, indicating the importance of the style, look, and feel of machine-generated images to people and indicating that style is coupled in the drawing process itself. Our code (https://github.com/pschaldenbrand/StyleCLIPDraw), a demonstration (https://replicate.com/pschaldenbrand/style-clip-draw), and style evaluation data (https://www.kaggle.com/pittsburghskeet/drawings-with-style-evaluation-styleclipdraw) are publicly available.