Paper Title

Dual Attention GANs for Semantic Image Synthesis

Paper Authors

Hao Tang, Song Bai, Nicu Sebe

Paper Abstract

In this paper, we focus on the semantic image synthesis task that aims at transferring semantic label maps to photo-realistic images. Existing methods lack effective semantic constraints to preserve the semantic information and ignore the structural correlations in both spatial and channel dimensions, leading to unsatisfactory blurry and artifact-prone results. To address these limitations, we propose a novel Dual Attention GAN (DAGAN) to synthesize photo-realistic and semantically-consistent images with fine details from the input layouts without imposing extra training overhead or modifying the network architectures of existing methods. We also propose two novel modules, i.e., position-wise Spatial Attention Module (SAM) and scale-wise Channel Attention Module (CAM), to capture semantic structure attention in spatial and channel dimensions, respectively. Specifically, SAM selectively correlates the pixels at each position by a spatial attention map, leading to pixels with the same semantic label being related to each other regardless of their spatial distances. Meanwhile, CAM selectively emphasizes the scale-wise features at each channel by a channel attention map, which integrates associated features among all channel maps regardless of their scales. We finally sum the outputs of SAM and CAM to further improve feature representation. Extensive experiments on four challenging datasets show that DAGAN achieves remarkably better results than state-of-the-art methods, while using fewer model parameters. The source code and trained models are available at https://github.com/Ha0Tang/DAGAN.
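As a rough illustration of the two attention mechanisms described in the abstract, the sketch below shows how a position-wise spatial attention module and a channel attention module of this general kind are typically built in PyTorch, with their outputs summed at the end. This is not the authors' implementation (see the linked repository for that); the class names, the 1x1-convolution projections, and the tensor shapes are assumptions made for illustration only, loosely following the common dual-attention formulation.

```python
# Minimal sketch of spatial (SAM-style) and channel (CAM-style) attention.
# NOT the DAGAN code; an illustrative, assumed implementation.
import torch
import torch.nn as nn

class SpatialAttentionModule(nn.Module):
    """Relates every spatial position to every other via an HWxHW attention map."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x):
        b, c, h, w = x.size()
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)   # B x HW x C'
        k = self.key(x).view(b, -1, h * w)                       # B x C' x HW
        attn = torch.softmax(torch.bmm(q, k), dim=-1)            # B x HW x HW
        v = self.value(x).view(b, -1, h * w)                     # B x C x HW
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + x

class ChannelAttentionModule(nn.Module):
    """Emphasizes inter-dependent channel maps via a CxC attention map."""
    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.size()
        q = x.view(b, c, -1)                                     # B x C x HW
        k = x.view(b, c, -1).permute(0, 2, 1)                    # B x HW x C
        attn = torch.softmax(torch.bmm(q, k), dim=-1)            # B x C x C
        out = torch.bmm(attn, x.view(b, c, -1)).view(b, c, h, w)
        return self.gamma * out + x

# The abstract states that the SAM and CAM outputs are summed to refine features.
feat = torch.randn(1, 64, 32, 32)
fused = SpatialAttentionModule(64)(feat) + ChannelAttentionModule()(feat)
```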
