Paper Title
CTCNet: A CNN-Transformer Cooperation Network for Face Image Super-Resolution
Paper Authors
Paper Abstract
Recently, face super-resolution methods steered by deep convolutional neural networks (CNNs) have achieved great progress in restoring degraded facial details by jointly training with facial priors. However, these methods have some obvious limitations. On the one hand, multi-task joint learning requires additional labeling of the dataset, and the introduced prior network significantly increases the computational cost of the model. On the other hand, the limited receptive field of CNNs reduces the fidelity and naturalness of the reconstructed facial images, resulting in suboptimal reconstructions. In this work, we propose an efficient CNN-Transformer Cooperation Network (CTCNet) for the face super-resolution task, which uses a multi-scale connected encoder-decoder architecture as its backbone. Specifically, we first devise a novel Local-Global Feature Cooperation Module (LGCM), composed of a Facial Structure Attention Unit (FSAU) and a Transformer block, to promote the consistency of local facial detail and global facial structure restoration simultaneously. Then, we design an efficient Feature Refinement Module (FRM) to enhance the encoded features. Finally, to further improve the restoration of fine facial details, we present a Multi-scale Feature Fusion Unit (MFFU) to adaptively fuse features from different stages of the encoder. Extensive evaluations on various datasets demonstrate that the proposed CTCNet significantly outperforms other state-of-the-art methods. The source code will be available at https://github.com/IVIPLab/CTCNet.
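To make the local-global cooperation described in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of an LGCM-style block: a small CNN branch stands in for the FSAU (local details), a self-attention branch stands in for the Transformer block (global structure), and the two outputs are fused by a 1x1 convolution. All module internals and names here are illustrative placeholders inferred only from the abstract, not the authors' implementation; refer to the linked repository for the actual code.

```python
# Hypothetical sketch of the abstract's local-global cooperation idea.
# Not the authors' code: FSAU/TransformerBlock internals are placeholders.
import torch
import torch.nn as nn


class FSAU(nn.Module):
    """Placeholder local branch: convolutions with a simple channel-attention gate."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        feat = self.body(x)
        return x + feat * self.attn(feat)


class TransformerBlock(nn.Module):
    """Placeholder global branch: self-attention over flattened spatial tokens."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)       # (B, H*W, C)
        normed = self.norm(tokens)
        out, _ = self.attn(normed, normed, normed)  # global spatial interaction
        return x + out.transpose(1, 2).reshape(b, c, h, w)


class LGCM(nn.Module):
    """Local-Global Feature Cooperation Module: run both branches, fuse by 1x1 conv."""
    def __init__(self, channels):
        super().__init__()
        self.local_branch = FSAU(channels)
        self.global_branch = TransformerBlock(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x):
        local_feat = self.local_branch(x)
        global_feat = self.global_branch(x)
        return self.fuse(torch.cat([local_feat, global_feat], dim=1))


if __name__ == "__main__":
    lgcm = LGCM(channels=32)
    y = lgcm(torch.randn(1, 32, 16, 16))
    print(y.shape)  # torch.Size([1, 32, 16, 16])
```

In this sketch several LGCM blocks would sit at each scale of the encoder-decoder backbone; the FRM and MFFU described in the abstract are omitted here, since the abstract does not specify their internals.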