Paper Title
Feature Super-Resolution Based Facial Expression Recognition for Multi-scale Low-Resolution Faces
Paper Authors
Paper Abstract
Facial Expression Recognition (FER) on low-resolution images is necessary for applications such as group expression recognition in crowd scenarios (stations, classrooms, etc.). Classifying a small facial image into the correct expression category remains a challenging task, mainly because discriminative features are lost as resolution decreases. Super-resolution methods are often used to enhance low-resolution images, but their benefit to the FER task is limited on images of very low resolution. In this work, inspired by feature super-resolution methods for object detection, we propose a novel generative adversarial network-based feature-level super-resolution method for robust facial expression recognition (FSR-FER). In particular, a pre-trained FER model is employed as the feature extractor, and a generator network G and a discriminator network D are trained with features extracted from low-resolution images and the original high-resolution images. The generator network G tries to transform the features of low-resolution images into more discriminative ones by making them closer to the features of the corresponding high-resolution images. For better classification performance, we also propose an effective classification-aware loss re-weighting strategy based on the classification probability computed by a fixed FER model, which makes our model focus more on samples that are easily misclassified. Experimental results on the Real-World Affective Faces (RAF) Database demonstrate that our method achieves satisfying results for various down-sampling factors with a single model, and performs better on low-resolution images than methods that apply image super-resolution and expression recognition separately.
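The abstract outlines the FSR-FER pipeline (a frozen pre-trained FER feature extractor, a generator G that refines low-resolution features toward high-resolution ones, a discriminator D, and classification-aware loss re-weighting) but gives no architectures or exact loss formulation. Below is a minimal PyTorch sketch of that training idea; the toy extractor/classifier, the residual MLP generator, the (1 - p_correct) weighting, and all hyperparameters are hypothetical stand-ins for illustration, not the authors' implementation.

```python
# Minimal sketch of the FSR-FER training idea described in the abstract (assumed
# PyTorch setup). All module designs and the exact re-weighting formula are
# illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM, NUM_CLASSES = 512, 7  # assumed feature size and number of expression classes

class Generator(nn.Module):
    """Maps features of a low-resolution face toward high-resolution features."""
    def __init__(self, dim=FEAT_DIM):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
    def forward(self, f_lr):
        return f_lr + self.net(f_lr)  # residual refinement of the LR feature

class Discriminator(nn.Module):
    """Scores whether a feature comes from a high-resolution image."""
    def __init__(self, dim=FEAT_DIM):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim // 2), nn.LeakyReLU(0.2),
                                 nn.Linear(dim // 2, 1))
    def forward(self, f):
        return self.net(f)

# Pre-trained FER model split into a frozen feature extractor and classifier head.
# A toy CNN stands in here for whatever backbone is actually used.
extractor = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(4), nn.Flatten(),
                          nn.Linear(32 * 16, FEAT_DIM))
classifier = nn.Linear(FEAT_DIM, NUM_CLASSES)
for p in list(extractor.parameters()) + list(classifier.parameters()):
    p.requires_grad_(False)  # the FER model stays fixed during FSR training

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(img_hr, img_lr_upsampled, labels):
    """One adversarial step on HR images, their LR counterparts (resized back to
    the FER model's input size), and expression labels."""
    with torch.no_grad():
        f_hr = extractor(img_hr)            # target "real" features
        f_lr = extractor(img_lr_upsampled)  # degraded features to be enhanced
        # Classification-aware re-weighting (assumed form): larger weight for samples
        # whose LR features the fixed FER model classifies poorly.
        p_correct = F.softmax(classifier(f_lr), dim=1).gather(1, labels[:, None]).squeeze(1)
        w = 1.0 - p_correct

    # Discriminator: real = HR features, fake = generated (super-resolved) features.
    f_sr = G(f_lr).detach()
    loss_d = bce(D(f_hr), torch.ones(f_hr.size(0), 1)) + \
             bce(D(f_sr), torch.zeros(f_sr.size(0), 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: fool D, match HR features, and keep features discriminative.
    f_sr = G(f_lr)
    adv = bce(D(f_sr), torch.ones(f_sr.size(0), 1))
    feat_l2 = ((f_sr - f_hr) ** 2).mean(dim=1)
    cls = F.cross_entropy(classifier(f_sr), labels, reduction="none")
    loss_g = adv + (w * (feat_l2 + cls)).mean()  # re-weighted per-sample terms
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

if __name__ == "__main__":
    hr = torch.randn(8, 3, 100, 100)
    lr = F.interpolate(F.interpolate(hr, scale_factor=0.25), size=(100, 100))
    y = torch.randint(0, NUM_CLASSES, (8,))
    print(train_step(hr, lr, y))
```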