Coarse-to-Fine Cascaded Networks with Smooth Predicting for Video Facial Expression Recognition

Paper Title

Coarse-to-Fine Cascaded Networks with Smooth Predicting for Video Facial Expression Recognition

Authors

Fanglei Xue, Zichang Tan, Yu Zhu, Zhongsong Ma, Guodong Guo

Abstract

Facial expression recognition plays an important role in human-computer interaction. In this paper, we propose the Coarse-to-Fine Cascaded network with Smooth Predicting (CFC-SP) to improve the performance of facial expression recognition. CFC-SP contains two core components: Coarse-to-Fine Cascaded networks (CFC) and Smooth Predicting (SP). CFC first groups several similar emotions into a coarse category and employs a network to perform a coarse but accurate classification. An additional network for the grouped emotions then produces fine-grained predictions. SP improves the recognition capability of the model by capturing both universal and unique expression features: the universal features describe the general characteristics of facial emotions over a period of time, while the unique features describe the specific characteristics at the current moment. Experiments on Aff-Wild2 show the effectiveness of the proposed CFC-SP. We achieved 3rd place in the Expression Classification Challenge of the 3rd Competition on Affective Behavior Analysis in-the-wild. The code will be released at https://github.com/BR-IDL/PaddleViT.
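The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The emotion grouping in `COARSE_GROUPS`, the function names, the score dictionaries, and the use of a windowed mean as the "universal" feature are all assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import Dict, List, Sequence

# Hypothetical grouping of similar emotions into coarse categories.
# The abstract does not state which emotions are grouped together.
COARSE_GROUPS: Dict[str, List[str]] = {
    "negative": ["anger", "disgust", "fear", "sadness"],
    "non_negative": ["neutral", "happiness", "surprise"],
}


def argmax_label(scores: Dict[str, float]) -> str:
    """Return the label with the highest score."""
    return max(scores, key=lambda label: scores[label])


def cascade_predict(
    coarse_scores: Dict[str, float],
    fine_scores_by_group: Dict[str, Dict[str, float]],
) -> str:
    """Coarse-to-fine decision: pick a coarse group first, then pick
    the fine-grained emotion only among that group's members."""
    group = argmax_label(coarse_scores)
    return argmax_label(fine_scores_by_group[group])


def universal_and_unique(
    frame_features: Sequence[Sequence[float]], window: int = 3
) -> List[List[float]]:
    """For each frame, concatenate its own feature vector ("unique")
    with the mean feature over a centered window ("universal").
    The windowed mean is an assumed stand-in for the paper's SP module."""
    half = window // 2
    combined: List[List[float]] = []
    for i, current in enumerate(frame_features):
        lo = max(0, i - half)
        hi = min(len(frame_features), i + half + 1)
        neighbours = frame_features[lo:hi]
        universal = [
            sum(f[d] for f in neighbours) / len(neighbours)
            for d in range(len(current))
        ]
        combined.append(list(current) + universal)
    return combined
```

In this sketch the scores would come from trained networks; here they are plain dictionaries so the control flow of the cascade and the per-frame feature combination can be read in isolation.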
