Paper Title
A Loss Function for Generative Neural Networks Based on Watson's Perceptual Model
Paper Authors
Paper Abstract
Training Variational Autoencoders (VAEs) to generate realistic imagery requires a loss function that reflects human perception of image similarity. We propose such a loss function based on Watson's perceptual model, which computes a weighted distance in frequency space and accounts for luminance and contrast masking. We extend the model to color images, increase its robustness to translation by using the Fourier transform, remove artifacts caused by splitting the image into blocks, and make it differentiable. In experiments, VAEs trained with the new loss function generated realistic, high-quality image samples. Compared to using the Euclidean distance and the Structural Similarity Index, the images were less blurry; compared to deep neural network based losses, the new approach required fewer computational resources and generated images with fewer artifacts.
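The abstract names the three ingredients of Watson's model: a weighted distance in frequency space, luminance masking, and contrast masking. As a rough illustration of how these combine into a differentiable loss, below is a minimal PyTorch sketch of a Watson-style distance on 8x8 DCT blocks of grayscale images. The sensitivity table `t` and the hyperparameters `alpha`, `w`, and `p` are illustrative assumptions in the spirit of Watson's original model, not the paper's values; the paper's color extension, DFT-based translation robustness, and block-artifact removal are omitted here.

```python
# Minimal sketch of a Watson-style perceptual distance (assumptions noted inline).
import math
import torch

BLOCK = 8

def dct_matrix(n=BLOCK):
    """Orthonormal DCT-II basis matrix D, so D @ x applies a 1-D DCT to columns."""
    k = torch.arange(n, dtype=torch.float32).unsqueeze(1)
    i = torch.arange(n, dtype=torch.float32).unsqueeze(0)
    d = torch.cos(math.pi * (2 * i + 1) * k / (2 * n)) * (2.0 / n) ** 0.5
    d[0] /= 2 ** 0.5  # DC row scaling for orthonormality
    return d

def block_dct(x):
    """x: (B, 1, H, W) grayscale in [0, 1], H and W divisible by 8 -> (B, nblocks, 8, 8)."""
    b = x.shape[0]
    blocks = x.unfold(2, BLOCK, BLOCK).unfold(3, BLOCK, BLOCK)
    blocks = blocks.reshape(b, -1, BLOCK, BLOCK)
    d = dct_matrix().to(x)
    return d @ blocks @ d.T  # 2-D DCT of every block

def watson_distance(x, y, alpha=0.649, w=0.7, p=4.0, eps=1e-8):
    """Watson-style distance from reference x to image y; masking uses x."""
    cx, cy = block_dct(x), block_dct(y)
    # Frequency sensitivity table: illustrative assumption, NOT the paper's values.
    fk = torch.arange(BLOCK, device=x.device, dtype=x.dtype)
    t = 0.1 + 0.05 * (fk.unsqueeze(0) + fk.unsqueeze(1))          # (8, 8)
    # Luminance masking: scale thresholds by each block's DC level.
    c0 = cx[..., 0, 0].clamp(min=eps)                              # per-block DC
    t_l = t * (c0 / c0.mean()).pow(alpha)[..., None, None]
    # Contrast masking: strong coefficients raise their own threshold.
    m = torch.maximum(t_l, cx.abs().pow(w) * t_l.pow(1 - w))
    # p-norm pooling of threshold-normalized coefficient differences.
    return ((cx - cy).abs() / m).pow(p).sum(dim=(1, 2, 3)).add(eps).pow(1 / p)
```

In a VAE, such a distance would stand in for the pixelwise reconstruction term, e.g. `loss = watson_distance(target, reconstruction).mean()`; because every step is differentiable, gradients flow back to the decoder as with an ordinary Euclidean loss.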