Facial action unit detection with emotion consistency: a cross-modal learning approach

Cited: 0
Authors
Song, Wenyu [1 ]
Liu, Dongxin [1 ]
An, Gaoyun [2 ,3 ]
Duan, Yun [1 ]
Wang, Laifu [1 ]
Affiliations
[1] China Telecom Res Inst, Guangzhou, Peoples R China
[2] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[3] Beijing Key Lab Adv Informat Sci & Network Technol, Beijing, Peoples R China
Keywords
Facial action unit; Emotional expression consistency; Cross-modal learning; Multi-task learning
DOI
10.1007/s00530-024-01552-0
Chinese Library Classification (CLC)
TP [automation technology; computer technology]
Discipline code
0812
Abstract
Facial Action Unit (AU) detection is essential for understanding emotional expressions. This study explores the intricate relationships among AUs, AU descriptions, and facial expressions, emphasizing emotional expression consistency. AUs represent specific facial muscle movements that form the basis of expressions, so grounding detection in this physiological foundation is crucial for understanding emotional communication. Moreover, AU descriptions serve as linguistic representations, and their semantic alignment with expressions is paramount: the vocabulary in AU descriptions must precisely reflect expression features to ensure coherence between textual and visual cues. Our method, AUTr-emo, employs cross-modal learning, incorporating AU text descriptions as queries and using facial expression recognition as an auxiliary task. This approach enforces emotional expression consistency across AUs, textual descriptions, and expressions. Extensive experiments on two challenging datasets, BP4D and DISFA, show that the proposed AUTr-emo achieves performance comparable to the state of the art in AU detection.
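The abstract describes a cross-modal design in which AU text-description embeddings act as queries over visual features, with expression recognition as an auxiliary task. The following is a minimal numpy sketch of that idea, not the authors' implementation: the function name, shapes, and the dot-product scoring head are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def au_detect_with_expr(visual_feats, au_text_embeds, W_expr):
    """AU detection with AU text descriptions as cross-modal queries,
    plus an auxiliary expression-recognition head (illustrative sketch).

    visual_feats:   (num_tokens, d)   visual feature tokens of one face
    au_text_embeds: (num_aus, d)      embeddings of AU text descriptions
    W_expr:         (d, num_classes)  weights of the auxiliary expression head
    """
    d = visual_feats.shape[-1]
    # AU text embeddings act as queries attending over visual tokens.
    attn = softmax(au_text_embeds @ visual_feats.T / np.sqrt(d))  # (num_aus, num_tokens)
    au_feats = attn @ visual_feats                                # (num_aus, d)
    # Per-AU presence logit: agreement between attended feature and its query.
    au_logits = (au_feats * au_text_embeds).sum(axis=-1)          # (num_aus,)
    # Auxiliary task: expression logits from mean-pooled visual features.
    expr_logits = visual_feats.mean(axis=0) @ W_expr              # (num_classes,)
    return au_logits, expr_logits
```

In a multi-task setup of this kind, training would jointly minimize a per-AU binary loss on `au_logits` and a cross-entropy loss on `expr_logits`, so the shared visual features stay consistent with both the AU descriptions and the expression label.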
Pages: 13
Related papers (50 total)
  • [21] BadCM: Invisible Backdoor Attack Against Cross-Modal Learning
    Zhang, Zheng
    Yuan, Xu
    Zhu, Lei
    Song, Jingkuan
    Nie, Liqiang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 2558 - 2571
  • [22] Multisensory integration and cross-modal learning in synaesthesia: A unifying model
    Newell, Fiona N.
    Mitchell, Kevin J.
    NEUROPSYCHOLOGIA, 2016, 88 : 140 - 150
  • [23] Cross-modal & Cross-domain Learning for Unsupervised LiDAR Semantic Segmentation
    Chen, Yiyang
    Zhao, Shanshan
    Ding, Changxing
    Tang, Liyao
    Wang, Chaoyue
    Tao, Dacheng
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 3866 - 3875
  • [24] Heterogeneous spatio-temporal relation learning network for facial action unit detection
    Song, Wenyu
    Shi, Shuze
    Dong, Yu
    An, Gaoyun
    PATTERN RECOGNITION LETTERS, 2022, 164 : 268 - 275
  • [25] Cross-modal interaction between visual and olfactory learning in Apis cerana
    Zhang, Li-Zhen
    Zhang, Shao-Wu
    Wang, Zi-Long
    Yan, Wei-Yu
    Zeng, Zhi-Jiang
    JOURNAL OF COMPARATIVE PHYSIOLOGY A-NEUROETHOLOGY SENSORY NEURAL AND BEHAVIORAL PHYSIOLOGY, 2014, 200 (10): : 899 - 909
  • [26] Surface Material Retrieval Using Weakly Paired Cross-Modal Learning
    Liu, Huaping
    Wang, Feng
    Sun, Fuchun
    Fang, Bin
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2019, 16 (02) : 781 - 791
  • [27] Causality-Invariant Interactive Mining for Cross-Modal Similarity Learning
    Yan, Jiexi
    Deng, Cheng
    Huang, Heng
    Liu, Wei
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (09) : 6216 - 6230
  • [28] CCL: Cross-modal Correlation Learning With Multigrained Fusion by Hierarchical Network
    Peng, Yuxin
    Qi, Jinwei
    Huang, Xin
    Yuan, Yuxin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (02) : 405 - 420
  • [29] LEARNING VISUALLY ALIGNED SEMANTIC GRAPH FOR CROSS-MODAL MANIFOLD MATCHING
    Li, Yanan
    Hu, Huanhang
    Wang, Donghui
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 3412 - 3416
  • [30] Cross-Modal Simplex Center Learning for Speech-Face Association
    Ma, Qiming
    Bu, Fanliang
    Wang, Rong
    Bu, Lingbin
    Wang, Yifan
    Li, Zhiyuan
    CMC-COMPUTERS MATERIALS & CONTINUA, 2025, 82 (03): : 5169 - 5184