Facial action unit detection with emotion consistency: a cross-modal learning approach

Cited: 0
Authors
Song, Wenyu [1 ]
Liu, Dongxin [1 ]
An, Gaoyun [2 ,3 ]
Duan, Yun [1 ]
Wang, Laifu [1 ]
Affiliations
[1] China Telecom Res Inst, Guangzhou, Peoples R China
[2] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[3] Beijing Key Lab Adv Informat Sci & Network Technol, Beijing, Peoples R China
Keywords
Facial Action Unit; Emotional expression consistency; Cross-modal learning; Multi-task learning;
DOI
10.1007/s00530-024-01552-0
CLC number
TP [Automation technology, computer technology];
Discipline code
0812;
Abstract
Facial Action Unit (AU) detection is essential for understanding emotional expressions. This study explores the intricate relationship among AUs, AU descriptions, and facial expressions, emphasizing emotional expression consistency. AUs represent specific facial muscle movements that form the basis of expressions, so maintaining a solid physiological grounding is crucial for understanding emotional communication. Moreover, AU descriptions serve as linguistic representations, and their semantic alignment with expressions is paramount. The vocabulary in AU descriptions must therefore precisely reflect expression features to ensure coherence between textual and visual cues. Our method, AUTr-emo, employs cross-modal learning, incorporating AU text descriptions as queries and using facial expression recognition as an auxiliary task. This approach highlights the importance of emotional expression consistency across AUs, textual descriptions, and expressions. Extensive experiments are conducted on two challenging datasets, BP4D and DISFA, and the results show that the proposed AUTr-emo achieves performance comparable to the state of the art in AU detection.
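The abstract's pipeline (AU text descriptions serving as cross-modal queries over visual features, with expression recognition as an auxiliary multi-task head) can be sketched at forward-pass level. This is a minimal NumPy illustration, not the authors' implementation: all dimensions, weights, and head designs are hypothetical, chosen only to show the data flow.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

D = 64        # shared embedding dimension (illustrative)
N_AU = 12     # AUs annotated in BP4D
N_EXPR = 7    # number of expression classes (assumption)
N_TOK = 49    # visual feature tokens, e.g. a 7x7 feature map

# Hypothetical inputs: text embeddings of the AU descriptions (the
# queries) and visual tokens from an image backbone.
au_text_queries = rng.standard_normal((N_AU, D))
visual_tokens = rng.standard_normal((N_TOK, D))

# Cross-modal attention: each AU-description query attends over the
# visual tokens and pools an AU-specific visual representation.
attn = softmax(au_text_queries @ visual_tokens.T / np.sqrt(D), axis=-1)
au_features = attn @ visual_tokens               # (N_AU, D)

# AU detection head: one sigmoid logit per AU (multi-label task).
w_au = rng.standard_normal((D,))
au_probs = sigmoid(au_features @ w_au)           # (N_AU,)

# Auxiliary expression head: pooled AU features -> softmax over
# expression classes, tying AU predictions to expression consistency.
w_expr = rng.standard_normal((D, N_EXPR))
expr_probs = softmax(au_features.mean(axis=0) @ w_expr)

print(au_probs.shape, expr_probs.shape)          # (12,) (7,)
```

In training, the AU head would carry a multi-label loss and the expression head an auxiliary classification loss, so that gradients from the expression task regularize the shared AU representations, which is the consistency idea the abstract describes.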
Pages: 13
Related papers
50 records
  • [41] Deep Latent Space Learning for Cross-modal Mapping of Audio and Visual Signals
    Nawaz, Shah
    Janjua, Muhammad Kamran
    Gallo, Ignazio
    Mahmood, Arif
    Calefati, Alessandro
    2019 DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2019, : 83 - 89
  • [42] Cross-Modal Learning Based Flexible Bimodal Biometric Authentication With Template Protection
    Jiang, Qi
    Zhao, Guichuan
    Ma, Xindi
    Li, Meng
    Tian, Youliang
    Li, Xinghua
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 3593 - 3607
  • [43] Unsupervised Cross-Modal Audio Representation Learning from Unstructured Multilingual Text
    Schindler, Alexander
    Gordea, Sergiu
    Knees, Peter
    PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 2020, : 706 - 713
  • [44] Cross-Modal Supervision-Based Multitask Learning With Automotive Radar Raw Data
    Jin, Yi
    Deligiannis, Anastasios
    Fuentes-Michel, Juan-Carlos
    Vossiek, Martin
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (04): : 3012 - 3025
  • [45] Cross-modal learning using privileged information for long-tailed image classification
    Li, Xiangxian
    Zheng, Yuze
    Ma, Haokai
    Qi, Zhuang
    Meng, Xiangxu
    Meng, Lei
    COMPUTATIONAL VISUAL MEDIA, 2024, 10 (05) : 981 - 992
  • [46] PointCMC: cross-modal multi-scale correspondences learning for point cloud understanding
    Zhou, Honggu
    Peng, Xiaogang
    Luo, Yikai
    Wu, Zizhao
    MULTIMEDIA SYSTEMS, 2024, 30 (03)
  • [47] Cross-Modal Learning via Adversarial Loss and Covariate Shift for Enhanced Liver Segmentation
    Ozkan, Savas
    Selver, M. Alper
    Baydar, Bora
    Kavur, Ali Emre
    Candemir, Cemre
    Akar, Gozde Bozdagi
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (04): : 2723 - 2735
  • [48] A Cross-Modal transfer approach for histological images: A case study in aquaculture for disease identification using Zero-Shot learning
    Mendieta, Milton
    Romero, Dennis
    2017 IEEE SECOND ECUADOR TECHNICAL CHAPTERS MEETING (ETCM), 2017,
  • [49] FROM INTRA-MODAL TO INTER-MODAL SPACE: MULTI-TASK LEARNING OF SHARED REPRESENTATIONS FOR CROSS-MODAL RETRIEVAL
    Choi, Jaeyoung
    Larson, Martha
    Friedland, Gerald
    Hanjalic, Alan
    2019 IEEE FIFTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM 2019), 2019, : 1 - 10
  • [50] Cross-Modal Learning Based on Semantic Correlation and Multi-Task Learning for Text-Video Retrieval
    Wu, Xiaoyu
    Wang, Tiantian
    Wang, Shengjin
    ELECTRONICS, 2020, 9 (12) : 1 - 17