DALLE-URBAN: Capturing the urban design expertise of large text to image transformers

Paper title

DALLE-URBAN: Capturing the urban design expertise of large text to image transformers

Authors

Seneviratne, Sachith; Senanayake, Damith; Rasnayaka, Sanka; Vidanaarachchi, Rajith; Thompson, Jason

Abstract

Automatically converting text descriptions into images using transformer architectures has recently received considerable attention. Such advances have implications for many applied design disciplines, including fashion, art, architecture, urban planning and landscape design, and for the future tools available to those disciplines. However, no detailed analysis of the capabilities of such models, specifically with a focus on the built environment, has been performed to date. In this work, we investigate in detail the capabilities and biases of such text-to-image methods as they apply to the built environment. We use a systematic grammar to generate queries related to the built environment and evaluate the resulting generated images. We generate 1020 different images and find that text-to-image transformers are robust at generating realistic images across different domains for this use case. Generated imagery can be found on GitHub: https://github.com/sachith500/DALLEURBAN
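
The abstract does not spell out the grammar used to generate queries. As a rough illustration only, a template-based grammar can be sketched as a Cartesian product over slot vocabularies. The slot names, the vocabularies and the template `<view> of <place> <modifier>` below are all hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical slot vocabularies (assumptions for illustration;
# the paper's actual grammar is not given in the abstract).
VIEWS = ["aerial view", "street-level photo"]
PLACES = ["a residential street", "a public park", "a city square"]
MODIFIERS = ["at night", "in winter", "with cyclists"]


def generate_queries(views, places, modifiers):
    """Expand the template '<view> of <place> <modifier>' over every slot combination."""
    return [f"{v} of {p} {m}" for v, p, m in product(views, places, modifiers)]


queries = generate_queries(VIEWS, PLACES, MODIFIERS)
```

Each resulting string would then be submitted as a prompt to a text-to-image model. Enumerating the full product keeps the query set systematic, so that differences between generated images can be traced back to a single changed slot.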
