Pose-Aware Facial Expression Recognition Assisted by Expression Descriptions

Cited by: 5

Authors:

Wang, Shangfei [1]

Wu, Yi [1]

Chang, Yanan [1]

Li, Guoming [2]

Mao, Meng [2]

Affiliations:

[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Key Lab Comp & Commun Software Anhui Province, Hefei 230027, Anhui, Peoples R China

[2] China Merchants Bank, AI Lab, Shenzhen 518000, Guangdong, Peoples R China

Source:

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING | 2024 / Vol. 15 / No. 01

Keywords:

Pose-aware; expression descriptions; facial expression recognition; cross-modality attention; JOINT POSE; MULTIVIEW;

DOI:

10.1109/TAFFC.2023.3267774

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Expression descriptions provide additional information about facial behaviors regardless of pose, and pose features help a model adapt to pose variation, yet neither has been fully leveraged in facial expression recognition. This paper proposes a pose-aware, text-assisted facial expression recognition method using cross-modality attention. The method contains three components. The pose feature extractor extracts pose-related features from facial images and is paired with a fully-connected layer for pose classification. When poses can be clearly discriminated and classified, the features obtained from the extractor can represent the corresponding poses. To eliminate bias due to appearance and illumination, cluster centers are taken as the final pose features. The text feature extractor obtains embeddings from expression descriptions. These descriptions are first passed through Intra-Exp attention to obtain preliminary embeddings. To leverage the correlations among expressions, all expression embeddings are then concatenated and passed through Inter-Exp attention. The cross-modality module learns attention maps that weight facial regions by importance, using prior knowledge about poses and expression descriptions. The image features weighted by the attention maps are used to recognize pose and expression jointly. Experiments on three benchmark datasets demonstrate the superiority of the proposed method.
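The two steps the abstract describes most concretely, taking cluster centers as pose features and weighting facial-region features with an attention map, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`class_centers`, `region_attention`), the use of per-class means as cluster centers, the element-wise sum of pose and text vectors as the attention query, and scaled dot-product scoring are all assumptions made for illustration; the record does not specify any of them.

```python
import math


def class_centers(features, labels):
    """Mean feature vector per pose label (assumed stand-in for cluster centers)."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def region_attention(region_feats, pose_center, text_emb):
    """Weight facial-region features by a pose- and text-conditioned query.

    Assumption: the query is the element-wise sum of the pose center and the
    text embedding, scored against each region by scaled dot product.
    Returns the attention weights and the attention-weighted pooled feature.
    """
    query = [p + t for p, t in zip(pose_center, text_emb)]
    scale = math.sqrt(len(query))
    scores = [sum(q * r for q, r in zip(query, reg)) / scale for reg in region_feats]
    weights = softmax(scores)
    dim = len(region_feats[0])
    pooled = [sum(w * reg[i] for w, reg in zip(weights, region_feats)) for i in range(dim)]
    return weights, pooled


# Toy data: three pose feature vectors with two pose labels, two facial regions.
centers = class_centers([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]],
                        ["frontal", "frontal", "profile"])
weights, pooled = region_attention([[1.0, 0.0], [0.0, 1.0]],
                                   centers["frontal"], [0.5, 0.5])
```

In this toy setting the region aligned with the frontal pose center receives the larger weight. In the method described above the pooled feature would then feed joint pose and expression classification, which the sketch leaves out.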

References

Pages: 241-253

Number of pages: 13

31 items in total

[1] Ba J, 2014, ACS SYM SER

[2] Darwin C., 1872, P374

[3] Ekman P., 2013, Emotion in the human face: Guidelines for research and an integration of findings, V11

[4] Discriminative Shared Gaussian Processes for Multiview and View-Invariant Facial Expression Recognition [J]. Eleftheriadis, Stefanos; Rudovic, Ognjen; Pantic, Maja. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2015, 24 (01): 189-204

[5] Facial Expression Recognition in the Wild via Deep Attentive Center Loss [J]. Farzaneh, Amir Hossein; Qi, Xiaojun. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021: 2401-2410

[6] Multi-PIE [J]. Gross, Ralph; Matthews, Iain; Cohn, Jeffrey; Kanade, Takeo; Baker, Simon. IMAGE AND VISION COMPUTING, 2010, 28 (05): 807-813

[7] Deep Residual Learning for Image Recognition [J]. He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778

[8] Multiview Facial Expression Recognition, A Survey [J]. Jampour, Mahdi; Javidi, Malihe. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (04): 2086-2105

[9] A Joint Mapping and Synthesis Approach for Multiview Facial Expression Recognition [J]. Jampour, Mahdi; Moin, Mohammad-Shahram. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2021, 35 (09)

[10] Pose-specific non-linear mappings in feature space towards multiview facial expression recognition [J]. Jampour, Mahdi; Lepetit, Vincent; Mauthner, Thomas; Bischof, Horst. IMAGE AND VISION COMPUTING, 2017, 58: 38-46
