Paper Title

SoloGAN: Multi-domain Multimodal Unpaired Image-to-Image Translation via a Single Generative Adversarial Network

Paper Authors

Shihua Huang, Cheng He, Ran Cheng

Paper Abstract

Despite significant advances in image-to-image (I2I) translation with generative adversarial networks (GANs), it remains challenging to effectively translate an image into a set of diverse images in multiple target domains using a single generator-discriminator pair. Existing I2I translation methods adopt multiple domain-specific content encoders, each trained with images from a single domain only. Nevertheless, we argue that the content (domain-invariant) features should be learned from images across all domains, so each domain-specific content encoder of existing schemes fails to extract domain-invariant features efficiently. To address this issue, we present a flexible and general SoloGAN model for efficient multimodal I2I translation among multiple domains with unpaired data. In contrast to existing methods, the SoloGAN algorithm uses a single projection discriminator with an additional auxiliary classifier, and shares the encoder and generator across all domains. Consequently, SoloGAN can be trained effectively with images from all domains, so that the domain-invariant content representation can be extracted efficiently. Qualitative and quantitative results over a wide range of datasets, against several counterparts and variants of SoloGAN, demonstrate the merits of the method, especially on challenging I2I translation datasets, i.e., datasets involving extreme shape variations or requiring complex backgrounds to be kept unchanged after translation. Furthermore, we demonstrate the contribution of each component of SoloGAN through ablation studies.
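The abstract names three architectural choices: a content encoder and generator shared across all domains, and a single projection discriminator augmented with an auxiliary domain classifier. The PyTorch sketch below is a minimal illustration of that wiring only; all module names, layer sizes, and the conditioning scheme are our own illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    # Shared by all domains, so it is trained on images from every domain
    # and can learn domain-invariant content features.
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 7, 1, 3), nn.InstanceNorm2d(ch), nn.ReLU(True),
            nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.InstanceNorm2d(ch * 2), nn.ReLU(True),
        )

    def forward(self, x):
        return self.net(x)  # content features, shape (B, 2*ch, H/2, W/2)

class Generator(nn.Module):
    # Single generator for all domains; a domain label plus a style code
    # select the target domain and the output mode (multimodality).
    def __init__(self, ch=64, num_domains=3, style_dim=8):
        super().__init__()
        self.embed = nn.Linear(num_domains + style_dim, ch * 2)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1), nn.InstanceNorm2d(ch), nn.ReLU(True),
            nn.Conv2d(ch, 3, 7, 1, 3), nn.Tanh(),
        )

    def forward(self, content, domain_onehot, style):
        cond = self.embed(torch.cat([domain_onehot, style], dim=1))
        h = content + cond[..., None, None]  # inject condition (illustrative choice)
        return self.net(h)

class ProjectionDiscriminator(nn.Module):
    # One discriminator for all domains: an unconditional real/fake score plus
    # a projection term on a learned domain embedding, with an auxiliary
    # classifier head that predicts the domain of the input image.
    def __init__(self, ch=64, num_domains=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, ch, 4, 2, 1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.LeakyReLU(0.2, True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.real_fake = nn.Linear(ch * 2, 1)          # unconditional score
        self.proj = nn.Embedding(num_domains, ch * 2)  # projection term
        self.aux_cls = nn.Linear(ch * 2, num_domains)  # auxiliary classifier

    def forward(self, x, domain_idx):
        h = self.features(x)
        score = self.real_fake(h) + (self.proj(domain_idx) * h).sum(1, keepdim=True)
        return score, self.aux_cls(h)

Under these assumptions, a translation step would be generator(encoder(x), onehot(target_domain), style) with the style code drawn from a prior for diverse outputs; the single discriminator scores the result under the target-domain embedding while the auxiliary classifier encourages correct domain membership.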
