Facial action unit detection with emotion consistency: a cross-modal learning approach

Authors
Song, Wenyu [1]
Liu, Dongxin [1]
An, Gaoyun [2, 3]
Duan, Yun [1]
Wang, Laifu [1]
Affiliations
[1] China Telecom Res Inst, Guangzhou, Peoples R China
[2] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[3] Beijing Key Lab Adv Informat Sci & Network Technol, Beijing, Peoples R China
Keywords
Facial Action Unit; Emotional expression consistency; Cross-modal learning; Multi-task learning
DOI
10.1007/s00530-024-01552-0
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Facial Action Unit (AU) detection is essential for understanding emotional expressions. This study explores the relationship among AUs, AU descriptions, and facial expressions, with an emphasis on emotional expression consistency. AUs represent specific facial muscle movements that form the basis of expressions, so a solid physiological grounding is crucial for understanding emotional communication. AU descriptions, in turn, serve as linguistic representations of those movements, and their semantic alignment with expressions is equally important: the vocabulary in AU descriptions must precisely reflect expression features to ensure coherence between textual and visual cues. Our method, AUTr-emo, employs cross-modal learning, using AU text descriptions as queries and facial expression recognition as an auxiliary task. This approach highlights the importance of emotional expression consistency across AUs, textual descriptions, and expressions. Extensive experiments on two challenging datasets, BP4D and DISFA, show that AUTr-emo achieves performance comparable to the state of the art in AU detection.
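The abstract names two ingredients: AU text descriptions used as queries, and facial expression recognition as an auxiliary task. The sketch below is only an illustration of that general idea, not the AUTr-emo architecture, which this record does not specify. It assumes that text-derived query vectors attend over image patch features, that each attended vector feeds a per-AU binary classifier, and that the mean of the attended vectors feeds an expression classifier. All names (`cross_attend`, `joint_loss`, `aux_weight`), the pooling choice, the dimensions, and the random inputs are hypothetical.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attend(queries, patches):
    """Scaled dot-product attention: each query (assumed to encode one AU
    description) takes a weighted average of the image patch features."""
    scale = math.sqrt(len(patches[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, p) / scale for p in patches])
        attended.append([sum(w * p[d] for w, p in zip(weights, patches))
                         for d in range(len(q))])
    return attended

def bce_with_logit(logit, label):
    # Numerically stable binary cross-entropy computed from a raw logit.
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def cross_entropy(logits, target):
    return -math.log(softmax(logits)[target])

def joint_loss(queries, patches, au_heads, expr_head,
               au_labels, expr_label, aux_weight=0.5):
    """AU detection loss plus a weighted auxiliary expression loss.
    aux_weight is a hypothetical hyperparameter, not a value from the paper."""
    attended = cross_attend(queries, patches)
    au_logits = [dot(w, h) for w, h in zip(au_heads, attended)]
    au_loss = sum(bce_with_logit(z, y)
                  for z, y in zip(au_logits, au_labels)) / len(au_labels)
    # Assumption: mean-pool the AU-specific features for the expression head.
    dim = len(attended[0])
    pooled = [sum(h[d] for h in attended) / len(attended) for d in range(dim)]
    expr_logits = [dot(w, pooled) for w in expr_head]
    total = au_loss + aux_weight * cross_entropy(expr_logits, expr_label)
    return total, au_logits, expr_logits

# Toy run with random stand-ins for encoder outputs and classifier weights.
rng = random.Random(0)
def rand_vec(d):
    return [rng.gauss(0.0, 1.0) for _ in range(d)]

DIM, N_AUS, N_PATCHES, N_EXPR = 8, 3, 16, 7
queries = [rand_vec(DIM) for _ in range(N_AUS)]      # stand-in text embeddings
patches = [rand_vec(DIM) for _ in range(N_PATCHES)]  # stand-in image features
au_heads = [rand_vec(DIM) for _ in range(N_AUS)]
expr_head = [rand_vec(DIM) for _ in range(N_EXPR)]

loss, au_logits, expr_logits = joint_loss(
    queries, patches, au_heads, expr_head,
    au_labels=[1, 0, 1], expr_label=2)
print(f"joint loss: {loss:.4f}")
```

In this sketch the auxiliary term only adds a second gradient signal through the shared attended features; how the published method couples the two tasks may differ.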
Pages: 13