Unsupervised Structure-Consistent Image-to-Image Translation

Paper Title

Unsupervised Structure-Consistent Image-to-Image Translation

Authors

Shima Shahfar, Charalambos Poullis

Abstract

The Swapping Autoencoder achieved state-of-the-art performance in deep image manipulation and image-to-image translation. We improve this work by introducing a simple yet effective auxiliary module based on gradient reversal layers. The auxiliary module's loss forces the generator to learn to reconstruct an image with an all-zero texture code, encouraging better disentanglement between the structure and texture information. The proposed attribute-based transfer method enables refined control in style transfer while preserving structural information without using a semantic mask. To manipulate an image, we encode both the geometry of the objects and the general style of the input images into two latent codes with an additional constraint that enforces structure consistency. Moreover, due to the auxiliary loss, training time is significantly reduced. The superiority of the proposed model is demonstrated in complex domains such as satellite images, where state-of-the-art methods are known to fail. Lastly, we show that our model improves the quality metrics for a wide range of datasets while achieving results comparable to those of multi-modal image generation techniques.
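
For readers unfamiliar with gradient reversal layers, the generic operation can be sketched in a few lines: the layer passes its input through unchanged on the forward pass and multiplies the incoming gradient by a negative factor on the backward pass. The abstract does not specify how the auxiliary module is wired into the network, so the class name `GradientReversal`, the scale `lam`, and the explicit `forward`/`backward` methods below are illustrative assumptions, not the authors' implementation.

```python
class GradientReversal:
    """Generic gradient reversal layer (illustrative sketch, not the paper's code).

    Forward pass: identity.
    Backward pass: incoming gradient multiplied by -lam.
    """

    def __init__(self, lam=1.0):
        # lam is a hypothetical scaling factor for the reversed gradient.
        self.lam = lam

    def forward(self, x):
        # Values pass through unchanged.
        return list(x)

    def backward(self, grad_output):
        # Flip the sign of the gradient and scale it by lam.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
features = [1.0, -2.0, 3.0]
out = grl.forward(features)                # same values as the input
grad_in = grl.backward([1.0, 2.0, -4.0])   # sign-flipped, scaled gradient
```

In a deep learning framework the same behaviour would normally be expressed as a custom autograd function rather than hand-written `forward` and `backward` methods.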
