Memory-Driven Text-to-Image Generation

Paper title

Memory-Driven Text-to-Image Generation

Authors

Bowen Li, Philip H. S. Torr, Thomas Lukasiewicz

Abstract

We introduce a memory-driven semi-parametric approach to text-to-image generation, which is based on both parametric and non-parametric techniques. The non-parametric component is a memory bank of image features constructed from a training set of images. The parametric component is a generative adversarial network. Given a new text description at inference time, the memory bank is used to selectively retrieve image features that are provided as basic information of target images, which enables the generator to produce realistic synthetic results. We also incorporate the content information into the discriminator, together with semantic features, allowing the discriminator to make a more reliable prediction. Experimental results demonstrate that the proposed memory-driven semi-parametric approach produces more realistic images than purely parametric approaches, in terms of both visual fidelity and text-image semantic consistency.
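The retrieval step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the memory bank is keyed or how features are matched, so everything here is an assumption made for illustration: the `MemoryBank` class, keying entries by text embeddings, and ranking by cosine similarity are hypothetical choices, not the authors' method. The image features are placeholder strings standing in for feature tensors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class MemoryBank:
    """Hypothetical non-parametric store of (text embedding, image feature) pairs.

    Assumption: entries are built once from the training set and are
    looked up by text-embedding similarity at inference time.
    """

    def __init__(self):
        self.keys = []    # text embeddings of training captions
        self.values = []  # image features of the paired training images

    def add(self, text_embedding, image_feature):
        self.keys.append(text_embedding)
        self.values.append(image_feature)

    def retrieve(self, query, k=1):
        """Return the k image features whose keys are most similar to the query."""
        ranked = sorted(
            range(len(self.keys)),
            key=lambda i: cosine(query, self.keys[i]),
            reverse=True,
        )
        return [self.values[i] for i in ranked[:k]]


# Toy example: two training pairs, then a new description at inference time.
bank = MemoryBank()
bank.add([1.0, 0.0], "feat_bird")
bank.add([0.0, 1.0], "feat_flower")
retrieved = bank.retrieve([0.9, 0.1], k=1)
```

In a full system, the retrieved features would be passed to the generator as the basic information of the target image, and the discriminator would receive content information alongside the semantic features; neither network is modeled in this sketch.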
