Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

Paper Title

Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

Authors

Shanshan Lao, Yuan Gong, Shuwei Shi, Sidi Yang, Tianhe Wu, Jiahao Wang, Weihao Xia, Yujiu Yang

Abstract

Image quality assessment (IQA) algorithms aim to quantify the human perception of image quality. Unfortunately, there is a performance drop when assessing distorted images generated by generative adversarial networks (GANs) with seemingly realistic texture. In this work, we conjecture that this maladaptation lies in the backbone of IQA models, where patch-level prediction methods use independent image patches as input to calculate their scores separately, but lack spatial relationship modeling among image patches. Therefore, we propose an Attention-based Hybrid Image Quality Assessment Network (AHIQ) to deal with the challenge and get better performance on the GAN-based IQA task. First, we adopt a two-branch architecture, including a vision transformer (ViT) branch and a convolutional neural network (CNN) branch for feature extraction. The hybrid architecture combines interaction information among image patches captured by ViT and local texture details from CNN. To make the features from the shallow CNN more focused on the visually salient region, a deformable convolution is applied with the help of semantic information from the ViT branch. Finally, we use a patch-wise score prediction module to obtain the final score. The experiments show that our model outperforms the state-of-the-art methods on four standard IQA datasets, and AHIQ ranked first on the Full Reference (FR) track of the NTIRE 2022 Perceptual Image Quality Assessment Challenge.
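
The abstract does not say how the patch-wise score prediction module turns per-patch scores into a single image-level score. As a purely illustrative sketch, and not the authors' implementation, the snippet below assumes a softmax-weighted average of patch scores. The function name `pool_patch_scores`, the `weight_logits` input, and the sample values are all hypothetical.

```python
import math


def pool_patch_scores(scores, weight_logits):
    """Combine per-patch quality scores into one image-level score.

    Assumption: each patch has a raw weight logit that a softmax turns
    into a positive weight. The abstract does not specify this step.
    """
    if not scores or len(scores) != len(weight_logits):
        raise ValueError("scores and weight_logits must be non-empty and equal in length")
    # Softmax with max-subtraction for numerical stability.
    peak = max(weight_logits)
    exps = [math.exp(w - peak) for w in weight_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted average: patches with larger weights contribute more.
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical scores for four patches; equal logits reduce to a plain mean.
patch_scores = [0.2, 0.4, 0.6, 0.8]
uniform = pool_patch_scores(patch_scores, [0.0, 0.0, 0.0, 0.0])
```

With equal logits the result is the arithmetic mean of the patch scores; raising one patch's logit pulls the pooled score toward that patch, which is one plausible way to let salient regions dominate the final prediction.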
