Best Prompts for Text-to-Image Models and How to Find Them

Paper Title

Best Prompts for Text-to-Image Models and How to Find Them

Authors

Nikita Pavlichenko, Dmitry Ustalov

Abstract

Recent progress in generative models, especially in text-guided diffusion models, has enabled the production of aesthetically-pleasing imagery resembling the works of professional human artists. However, one has to carefully compose the textual description, called the prompt, and augment it with a set of clarifying keywords. Since aesthetics are challenging to evaluate computationally, human feedback is needed to determine the optimal prompt formulation and keyword combination. In this paper, we present a human-in-the-loop approach to learning the most useful combination of prompt keywords using a genetic algorithm. We also show how such an approach can improve the aesthetic appeal of images depicting the same descriptions.
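The keyword search described in the abstract can be pictured with a small genetic algorithm over keyword subsets. The following is only a minimal sketch under stated assumptions, not the authors' implementation: each candidate is a bit mask over a keyword list, and the human feedback step is replaced by a hypothetical scoring function (`stand_in_fitness`). All names, the example keywords, the crossover and mutation scheme, and the parameter values are illustrative assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple

# One bit per keyword: 1 means the keyword is appended to the prompt.
Genome = Tuple[int, ...]


def build_prompt(description: str, keywords: Sequence[str], genome: Genome) -> str:
    """Join the image description with the keywords selected by the genome."""
    chosen = [kw for kw, bit in zip(keywords, genome) if bit]
    return ", ".join([description, *chosen])


def stand_in_fitness(genome: Genome) -> float:
    """Hypothetical placeholder for human feedback.

    Assumption: in a real human-in-the-loop setup this score would come from
    people judging generated images, not from a fixed formula.
    Here the first half of the keywords is rewarded and the rest penalized.
    """
    half = len(genome) // 2
    return float(sum(genome[:half]) - sum(genome[half:]))


def evolve(
    num_keywords: int,
    fitness: Callable[[Genome], float],
    pop_size: int = 8,
    generations: int = 20,
    mutation_rate: float = 0.1,
    seed: int = 0,
) -> Genome:
    """Evolve keyword masks; requires num_keywords >= 2 and pop_size >= 4."""
    rng = random.Random(seed)
    population: List[Genome] = [
        tuple(rng.randint(0, 1) for _ in range(num_keywords))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Selection: keep the better-scoring half unchanged (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children: List[Genome] = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Single-point crossover between two distinct parents.
            cut = rng.randrange(1, num_keywords)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit independently with a small probability.
            child = tuple(
                bit ^ 1 if rng.random() < mutation_rate else bit for bit in child
            )
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

For example, with a hypothetical keyword list such as `["highly detailed", "sharp focus", "blurry", "low quality"]`, calling `evolve(4, stand_in_fitness)` returns a bit mask, and `build_prompt` turns that mask and an image description into the final prompt string. Replacing `stand_in_fitness` with scores derived from human judgments is what would make such a loop human-in-the-loop.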
