Cross-Layer Contrastive Learning of Latent Semantics for Facial Expression Recognition

Cited by: 3

Authors:

Xie, Weicheng [1]

Peng, Zhibin [1]

Shen, Linlin [1]

Lu, Wenya [1]

Zhang, Yang [1]

Song, Siyang [2]

Affiliations:

[1] Shenzhen Univ, Shenzhen Inst Artificial Intelligence, Sch Comp Sci & Software Engn, Comp Vis Inst, Guangdong Key Lab Intelligent Inform, Shenzhen 518060, Peoples R China

[2] Univ Cambridge, Dept Comp Sci & Technol, Cambridge CB2 1TN, England

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2024 / Vol. 33

Keywords:

Semantics; Cross layer design; Face recognition; Self-supervised learning; Representation learning; Faces; Task analysis; Facial expression recognition; contrastive learning; latent semantic alignment; multi-layer attention; NETWORK; ATTENTION;

DOI:

10.1109/TIP.2024.3378459

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Convolutional neural networks (CNNs) have significantly improved facial expression recognition. However, current training still suffers from inconsistent learning intensities across layers, i.e., the feature representations in the shallow layers are not learned as sufficiently as those in the deep layers. To this end, this work proposes a contrastive learning framework that aligns the feature semantics of shallow and deep layers, followed by an attention module that represents the multi-scale features in a weight-adaptive manner. The proposed algorithm has three main merits. First, the learning intensity of the shallow-layer features, defined as the magnitude of the backpropagation gradient, is enhanced by cross-layer contrastive learning. Second, the latent semantics in the shallow-layer and deep-layer features are explored and aligned during contrastive learning, so the fine-grained characteristics of expressions can be taken into account in feature representation learning. Third, by integrating the multi-scale features from multiple layers with an attention module, the algorithm achieves state-of-the-art performance, i.e., 92.21%, 89.50%, and 62.82%, on three in-the-wild expression databases (RAF-DB, FERPlus, and SFEW, respectively), and the second-best performance, 65.29%, on the AffectNet dataset. Our codes will be made publicly available.
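The cross-layer alignment idea in the abstract can be illustrated with a generic InfoNCE-style loss. This is only a minimal sketch under stated assumptions, not the authors' formulation: the abstract does not give the loss, so the InfoNCE form, the `temperature` value, the use of in-batch negatives, and the premise that shallow and deep features have already been projected to a shared dimension are all assumptions, and the function names are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cross_layer_info_nce(shallow, deep, temperature=0.1):
    """Generic InfoNCE-style loss between two sets of embeddings.

    Assumption: shallow[i] and deep[i] come from the same image and have
    already been projected to the same dimension. The positive pair is
    (shallow[i], deep[i]); deep[j] for j != i act as negatives.
    """
    s = [l2_normalize(v) for v in shallow]
    d = [l2_normalize(v) for v in deep]
    total = 0.0
    for i, si in enumerate(s):
        # Cosine similarity of shallow sample i to every deep sample.
        logits = [sum(a * b for a, b in zip(si, dj)) / temperature for dj in d]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Cross-entropy with the matching deep sample as the target.
        total += log_denom - logits[i]
    return total / len(s)


# Toy data (hypothetical): two samples with 2-D embeddings.
deep = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[2.0, 0.1], [0.1, 3.0]]   # shallow features agree with deep ones
swapped = [aligned[1], aligned[0]]   # shallow features paired with the wrong sample

loss_aligned = cross_layer_info_nce(aligned, deep)
loss_swapped = cross_layer_info_nce(swapped, deep)
print(loss_aligned < loss_swapped)
```

In this toy setting the loss is small when each shallow embedding points in the same direction as its deep counterpart and large when the pairs are mismatched. In a training framework, a term of this kind would send gradient to the shallow layers directly, which is the general sense in which the abstract describes their learning intensity being enhanced.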


Pages: 2514 - 2529

Page count: 16

50 items in total

[31] Joint Deep Learning of Facial Expression Synthesis and Recognition
Yan, Yan
Huang, Ying
Chen, Si
Shen, Chunhua
Wang, Hanzi
IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (11) : 2792 - 2807
[32] Facial expression recognition using dual dictionary learning
Moeini, Ali
Faez, Karim
Moeini, Hossein
Safai, Armon Matthew
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2017, 45 : 20 - 33
[33] Deep Cross-Layer Collaborative Learning Network for Online Knowledge Distillation
Su, Tongtong
Liang, Qiyu
Zhang, Jinsong
Yu, Zhaoyang
Xu, Ziyue
Wang, Gang
Liu, Xiaoguang
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (05) : 2075 - 2087
[34] Multimodal learning for facial expression recognition
Zhang, Wei
Zhang, Youmei
Ma, Lin
Guan, Jingwei
Gong, Shijie
PATTERN RECOGNITION, 2015, 48 (10) : 3191 - 3202
[35] Learning Dynamic Relationships for Facial Expression Recognition Based on Graph Convolutional Network
Jin, Xing
Lai, Zhihui
Jin, Zhong
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 (30) : 7143 - 7155
[36] DR-FER: Discriminative and Robust Representation Learning for Facial Expression Recognition
Li, Ming
Fu, Huazhu
He, Shengfeng
Fan, Hehe
Liu, Jun
Keppo, Jussi
Shou, Mike Zheng
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6297 - 6309
[37] AU-Oriented Expression Decomposition Learning for Facial Expression Recognition
Lin, Zehao
She, Jiahui
Shen, Qiu
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT V, 2024, 14429 : 265 - 277
[38] Expression-Guided Deep Joint Learning for Facial Expression Recognition
Fang, Bei
Zhao, Yujie
Han, Guangxin
He, Juhou
SENSORS, 2023, 23 (16)
[39] Contrastive Learning of View-invariant Representations for Facial Expressions Recognition
Roy, Shuvendu
Etemad, Ali
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (04)
[40] Robust Transferable Subspace Learning for Cross-Corpus Facial Expression Recognition
Chen, Dongliang
Song, Peng
Zhang, Wenjing
Zhang, Weijian
Xu, Bingui
Zhou, Xuan
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2020, E103D (10) : 2241 - 2245
