Facial action unit detection with emotion consistency: a cross-modal learning approach

Times Cited: 0
Authors
Song, Wenyu [1 ]
Liu, Dongxin [1 ]
An, Gaoyun [2 ,3 ]
Duan, Yun [1 ]
Wang, Laifu [1 ]
Affiliations
[1] China Telecom Res Inst, Guangzhou, Peoples R China
[2] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[3] Beijing Key Lab Adv Informat Sci & Network Technol, Beijing, Peoples R China
Keywords
Facial Action Unit; Emotional expression consistency; Cross-modal learning; Multi-task learning
DOI
10.1007/s00530-024-01552-0
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Facial Action Unit (AU) detection is essential for understanding emotional expressions. This study explores the intricate relationship among AUs, AU descriptions, and facial expressions, emphasizing emotional expression consistency. AUs represent specific facial muscle movements that form the physiological basis of expressions, so grounding AU detection in this foundation is crucial for understanding emotional communication. Moreover, AU descriptions serve as linguistic representations, so their semantic alignment with expressions is paramount: the vocabulary in AU descriptions must precisely reflect expression features to ensure coherence between textual and visual cues. Our method, AUTr-emo, employs cross-modal learning, incorporating AU text descriptions as queries and using facial expression recognition as an auxiliary task. This approach enforces emotional expression consistency across AUs, textual descriptions, and expressions. Extensive experiments on two challenging datasets, BP4D and DISFA, show that the proposed AUTr-emo achieves performance comparable to the state of the art in AU detection.
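The abstract's core mechanism — AU text-description embeddings acting as queries over visual features, trained jointly with an auxiliary expression-recognition head — can be sketched in plain numpy. This is a minimal illustration of the general idea, not the paper's architecture: all dimensions, weight tensors, and the auxiliary-loss weight below are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes (not from the paper): 12 AUs, 7 basic expressions,
# 49 visual tokens (e.g. a 7x7 feature map), feature dimension 64.
n_aus, n_expr, n_tokens, d = 12, 7, 49, 64

# AU text-description embeddings serve as queries; visual tokens as keys/values.
au_text_queries = rng.standard_normal((n_aus, d))
visual_tokens = rng.standard_normal((n_tokens, d))

def cross_attend(queries, tokens):
    """One cross-attention step: each AU query pools visual evidence."""
    attn = softmax(queries @ tokens.T / np.sqrt(d), axis=-1)  # (n_aus, n_tokens)
    return attn @ tokens                                      # (n_aus, d)

au_features = cross_attend(au_text_queries, visual_tokens)

# Per-AU binary detection logits, plus an auxiliary expression head that
# pools the AU features — the multi-task coupling that ties AU detection
# to expression consistency.
w_au = rng.standard_normal((d,))
w_expr = rng.standard_normal((d, n_expr))

au_logits = au_features @ w_au                   # (n_aus,)
expr_logits = au_features.mean(axis=0) @ w_expr  # (n_expr,)

# Joint loss: AU binary cross-entropy + weighted expression cross-entropy.
au_labels = rng.integers(0, 2, size=n_aus).astype(float)
expr_label = 3  # illustrative ground-truth expression index

p_au = 1.0 / (1.0 + np.exp(-au_logits))
bce = -(au_labels * np.log(p_au) + (1 - au_labels) * np.log(1 - p_au)).mean()
ce = -np.log(softmax(expr_logits)[expr_label])
lam = 0.5  # auxiliary-task weight (a free hyperparameter here)
loss = bce + lam * ce
```

Training would backpropagate through both terms, so the visual features pooled per AU must simultaneously support correct AU labels and a coherent expression prediction — a simple way to operationalize the consistency constraint the abstract describes.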
Pages: 13