Paper Title
Rethinking the Truly Unsupervised Image-to-Image Translation
Paper Authors
Paper Abstract
Every recent image-to-image translation model inherently requires either image-level (i.e., input-output pairs) or set-level (i.e., domain labels) supervision. However, even set-level supervision can be a severe bottleneck for data collection in practice. In this paper, we tackle image-to-image translation in a fully unsupervised setting, i.e., neither paired images nor domain labels. To this end, we propose a truly unsupervised image-to-image translation model (TUNIT) that simultaneously learns to separate image domains and translates input images into the estimated domains. Experimental results show that our model achieves comparable or even better performance than the set-level supervised model trained with full labels, generalizes well on various datasets, and is robust against the choice of hyperparameters (e.g., the preset number of pseudo domains). Furthermore, TUNIT can be easily extended to semi-supervised learning with a few labeled data.
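The abstract describes the mechanism only at a high level: a guiding network estimates a pseudo domain for each unlabeled image, and a conditional generator translates inputs into that estimated domain. The sketch below is a minimal, illustrative PyTorch rendering of that idea, not the authors' implementation. The module names (GuidingNet, Generator, Discriminator), the toy architectures, the train_step routine, and the adversarial loss form are assumptions made for illustration; TUNIT's actual objectives (e.g., its mutual-information clustering and style-contrastive losses) are omitted for brevity.

```python
# Minimal sketch of the "estimate pseudo domains, then translate into them" idea.
# Assumption: all modules and losses below are simplified stand-ins, not TUNIT itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 10  # preset number of pseudo domains (a hyperparameter in this sketch)

class GuidingNet(nn.Module):
    """Encodes an image into (pseudo-domain logits, style code)."""
    def __init__(self, style_dim=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cls_head = nn.Linear(64, K)            # pseudo-domain classifier
        self.style_head = nn.Linear(64, style_dim)  # style code

    def forward(self, x):
        h = self.backbone(x)
        return self.cls_head(h), self.style_head(h)

class Generator(nn.Module):
    """Translates an image conditioned on a target style code (toy version)."""
    def __init__(self, style_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + style_dim, 64, 3, 1, 1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, 1, 1), nn.Tanh())

    def forward(self, x, s):
        s_map = s[:, :, None, None].expand(-1, -1, x.size(2), x.size(3))
        return self.net(torch.cat([x, s_map], dim=1))

class Discriminator(nn.Module):
    """One real/fake output per pseudo domain (multi-task discriminator)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, K))

    def forward(self, x, domain):
        out = self.net(x)                       # (B, K): one logit per pseudo domain
        return out.gather(1, domain[:, None])   # select the estimated domain's head

E, G, D = GuidingNet(), Generator(), Discriminator()
opt_g = torch.optim.Adam(list(G.parameters()) + list(E.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

def train_step(x_src, x_ref):
    """One unsupervised step: no paired images, no domain labels."""
    # 1) Estimate the reference image's pseudo domain and style code.
    logits_ref, style_ref = E(x_ref)
    domain_ref = logits_ref.argmax(dim=1)

    # 2) Discriminator step on the head of the estimated pseudo domain.
    fake = G(x_src, style_ref).detach()
    d_loss = (F.softplus(-D(x_ref, domain_ref)).mean()
              + F.softplus(D(fake, domain_ref)).mean())
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 3) Generator/guiding-network step: fool that pseudo domain's head.
    logits_ref, style_ref = E(x_ref)
    fake = G(x_src, style_ref)
    g_loss = F.softplus(-D(fake, logits_ref.argmax(dim=1))).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Toy usage with random tensors standing in for unlabeled images.
x_src, x_ref = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)
print(train_step(x_src, x_ref))
```

The design point this sketch tries to convey is that the domain index fed to the discriminator is itself produced by the guiding network rather than read from labels, which is what allows domain separation and translation to be trained jointly without set-level supervision.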