Improving Image Autoencoder Embeddings with Perceptual Loss

Paper title

Improving Image Autoencoder Embeddings with Perceptual Loss

Authors

Pihlgren, Gustav Grund; Sandin, Fredrik; Liwicki, Marcus

Abstract

Autoencoders are commonly trained using element-wise loss. However, element-wise loss disregards high-level structures in the image, which can lead to embeddings that disregard them as well. A recent improvement to autoencoders that helps alleviate this problem is the use of perceptual loss. This work investigates perceptual loss from the perspective of the encoder embeddings themselves. Autoencoders are trained to embed images from three different computer vision datasets using perceptual loss based on a pretrained model as well as pixel-wise loss. A host of different predictors are trained to perform object positioning and classification on the datasets, given the embedded images as input. The two kinds of losses are evaluated by comparing how the predictors perform with embeddings from the differently trained autoencoders. The results show that, in the image domain, the embeddings generated by autoencoders trained with perceptual loss enable more accurate predictions than those generated by autoencoders trained with element-wise loss. Furthermore, the results show that, on the task of object positioning of a small-scale feature, perceptual loss can improve the results by a factor of 10. The experimental setup is available online: https://github.com/guspih/Perceptual-Autoencoders
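
The contrast between the two kinds of loss can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says the perceptual loss is based on a pretrained model, whereas the hypothetical `toy_features` below is a hand-written stand-in (differences between neighbouring pixels) chosen only so the example runs without any dependencies. The images, names, and values are all assumptions made for illustration.

```python
# Sketch only: `toy_features` is a hypothetical stand-in for the pretrained
# network that a real perceptual loss would use to extract features.

def elementwise_loss(x, y):
    """Mean squared error over corresponding entries of two 2D grids."""
    n = len(x) * len(x[0])
    return sum((a - b) ** 2 for rx, ry in zip(x, y) for a, b in zip(rx, ry)) / n


def toy_features(img):
    """Return horizontal and vertical differences between neighbouring pixels."""
    horiz = [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in img]
    vert = [[img[i + 1][j] - img[i][j] for j in range(len(img[0]))]
            for i in range(len(img) - 1)]
    return horiz, vert


def perceptual_loss(x, y, features=toy_features):
    """Compare feature maps of the two images instead of their raw pixels."""
    fx, fy = features(x), features(y)
    return sum(elementwise_loss(a, b) for a, b in zip(fx, fy)) / len(fx)


# A dark 4x4 image with one bright pixel, i.e. a small-scale feature.
target = [[0.0] * 4 for _ in range(4)]
target[1][1] = 1.0

# Reconstruction that loses the bright pixel entirely.
blank = [[0.0] * 4 for _ in range(4)]

# Reconstruction that keeps the bright pixel but is uniformly brighter.
shifted = [[p + 0.25 for p in row] for row in target]

print(elementwise_loss(target, blank), elementwise_loss(target, shifted))
print(perceptual_loss(target, blank), perceptual_loss(target, shifted))
```

In this toy setting both reconstructions receive the same element-wise loss, while the feature-based loss penalizes only the reconstruction that drops the small feature. This says nothing about the size of the effect on real data, which is what the paper's experiments measure.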
