Paper Title

LayoutTransformer: Layout Generation and Completion with Self-attention

Paper Authors

Kamal Gupta, Justin Lazarow, Alessandro Achille, Larry Davis, Vijay Mahadevan, Abhinav Shrivastava

Paper Abstract

We address the problem of scene layout generation for diverse domains such as images, mobile applications, documents, and 3D objects. Most complex scenes, natural or human-designed, can be expressed as a meaningful arrangement of simpler compositional graphical primitives. Generating a new layout or extending an existing layout requires understanding the relationships between these primitives. To do this, we propose LayoutTransformer, a novel framework that leverages self-attention to learn contextual relationships between layout elements and generate novel layouts in a given domain. Our framework allows us to generate a new layout either from an empty set or from an initial seed set of primitives, and can easily scale to support an arbitrary number of primitives per layout. Furthermore, our analyses show that the model is able to automatically capture the semantic properties of the primitives. We propose simple improvements in both the representation of layout primitives and training methods to demonstrate competitive performance in very diverse data domains such as object bounding boxes in natural images (COCO bounding boxes), documents (PubLayNet), mobile applications (RICO dataset), as well as 3D shapes (PartNet). Code and other materials will be made available at https://kampta.github.io/layout.
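
The abstract describes a self-attention model that generates layout primitives autoregressively, either from scratch or by completing a seed set. Below is a minimal, illustrative PyTorch sketch of that idea, not the authors' implementation (see the project URL for their released code). It assumes each primitive is flattened into discrete tokens (e.g., a category token plus quantized x, y, w, h); all module names, token vocabularies, and hyperparameters here are placeholders.

```python
# Minimal sketch (assumptions, not the paper's exact code): a decoder-only
# Transformer with causal self-attention over discretized layout tokens.
import torch
import torch.nn as nn

class LayoutDecoder(nn.Module):
    def __init__(self, vocab_size=256, d_model=512, nhead=8, num_layers=6, max_len=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids for <bos>, element categories,
        # and quantized x/y/w/h coordinates, flattened into one sequence.
        b, t = tokens.shape
        pos = torch.arange(t, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(pos)
        causal = nn.Transformer.generate_square_subsequent_mask(t).to(tokens.device)
        h = self.blocks(x, mask=causal)   # self-attention restricted to the past
        return self.head(h)               # next-token logits

@torch.no_grad()
def complete_layout(model, seed, steps=50):
    # seed: (batch, seq_len) token ids beginning with <bos>; pass just <bos>
    # to generate a layout from scratch, or <bos> + a partial layout to
    # complete an existing one.
    tokens = seed
    for _ in range(steps):
        logits = model(tokens)[:, -1]
        nxt = torch.multinomial(logits.softmax(-1), num_samples=1)
        tokens = torch.cat([tokens, nxt], dim=1)
    return tokens
```

Under these assumptions, training would minimize cross-entropy between the predicted next-token distribution and the ground-truth token at each position, and the same `complete_layout` routine covers both generation from an empty layout and completion of a seed set.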
