Contrastive Learning of Person-Independent Representations for Facial Action Unit Detection

Cited by: 6
Authors
Li, Yong [1 ]
Shan, Shiguang [2 ,3 ,4 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Key Lab Intelligent Percept & Syst High Dimens Inf, Minist Educ, Nanjing 210094, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
[4] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Gold; Videos; Training; Image reconstruction; Feature extraction; Faces; Task analysis; Facial action unit detection; contrastive learning; self-supervised learning; person-independent action unit detection; NEURAL-NETWORKS;
DOI
10.1109/TIP.2023.3279978
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Facial action unit (AU) detection, which aims to classify the AUs present in a facial image, has long suffered from insufficient AU annotations. In this paper, we mitigate this data-scarcity issue by learning AU representations from a large number of unlabelled facial videos in a contrastive learning paradigm. We formulate the self-supervised AU representation learning signals in two ways: 1) AU representations should be frame-wise discriminative within a short video clip; 2) facial frames sampled from different identities but showing analogous AUs should have consistent AU representations. To achieve these goals, we contrastively learn the AU representation within a video clip and devise a cross-identity reconstruction mechanism to learn person-independent representations. Specifically, we adopt a margin-based temporal contrastive learning paradigm to perceive the temporal AU coherence and evolution characteristics within a clip of consecutive facial frames. Moreover, the cross-identity reconstruction mechanism pushes faces from different identities but showing analogous AUs closer in the latent embedding space. Experimental results on three public AU datasets demonstrate that the learned AU representation is discriminative for AU detection. Our method outperforms other contrastive learning methods and significantly narrows the performance gap between self-supervised and supervised AU detection approaches.
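The margin-based temporal contrastive idea described above can be illustrated with a minimal sketch. This is not the paper's implementation; the loss below merely assumes one plausible reading of "temporally adjacent frames should be closer than distant ones by a margin", with the positive/negative offsets and the margin value chosen for illustration:

```python
import numpy as np

def temporal_margin_contrastive_loss(features, margin=0.5):
    """Illustrative margin-based temporal contrastive loss (assumed form).

    features: (T, D) array of per-frame AU embeddings from one clip.
    For each anchor frame t, the adjacent frame t+1 acts as the positive
    and the more distant frame t+2 as the negative: the anchor should be
    at least `margin` closer to the positive than to the negative.
    """
    # L2-normalise embeddings so distances live on the unit sphere
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    losses = []
    for t in range(len(f) - 2):
        d_pos = np.linalg.norm(f[t] - f[t + 1])  # anchor-positive distance
        d_neg = np.linalg.norm(f[t] - f[t + 2])  # anchor-negative distance
        # hinge: penalise only when the margin between the pairs is violated
        losses.append(max(0.0, margin + d_pos - d_neg))
    return float(np.mean(losses))
```

A clip whose embeddings drift smoothly over time yields a small loss, while a clip whose adjacent frames are no closer than distant ones is penalised, encouraging frame-wise discriminative yet temporally coherent AU representations.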
Pages: 3212-3225
Page count: 14
Related Papers
50 records in total
  • [1] Emotion-aware Contrastive Learning for Facial Action Unit Detection
    Sun, Xuran
    Zeng, Jiabei
    Shan, Shiguang
    2021 16TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2021), 2021,
  • [2] Contrastive Adversarial Learning for Person Independent Facial Emotion Recognition
    Kim, Daeha
    Song, Byung Cheol
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 5948 - 5956
  • [3] CDRL: Contrastive Disentangled Representation Learning Scheme for Facial Action Unit Detection
    Zhao, Huijuan
    He, Shuangjiang
    Yu, Li
    Du, Congju
    Xiang, Jinqiao
    2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022, : 652 - 659
  • [4] Contrastive Feature Learning and Class-Weighted Loss for Facial Action Unit Detection
    Wu, Bing-Fei
    Wei, Yin-Tse
    Wu, Bing-Jhang
    Lin, Chun-Hsien
    2019 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), 2019, : 2478 - 2483
  • [5] Affective interaction based on person-independent facial expression space
    Wang, Hao
    Wang, Kongqiao
    NEUROCOMPUTING, 2008, 71 (10-12) : 1889 - 1901
  • [6] Person-independent Facial Expression Recognition via Hierarchical Classification
    Xue, Mingliang
    Liu, Wanquan
    Li, Ling
    2013 IEEE EIGHTH INTERNATIONAL CONFERENCE ON INTELLIGENT SENSORS, SENSOR NETWORKS AND INFORMATION PROCESSING, 2013, : 449 - 454
  • [7] Online Action Detection with Learning Future Representations by Contrastive Learning
    Leng, Haitao
    Shi, Xiaoming
    Zhou, Wei
    Zhang, Kuncai
    Shi, Qiankun
    Zhu, Pengcheng
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2213 - 2218
  • [8] Video-based person-dependent and person-independent facial emotion recognition
    Hajarolasvadi, Noushin
    Bashirov, Enver
    Demirel, Hasan
    Signal, Image and Video Processing, 2021, 15 : 1049 - 1056
  • [9] Person-Independent Facial Expression Recognition with Histograms of Prominent Edge Directions
    Makhmudkhujaev, Farkhod
    Bin Iqbal, Tauhid
    Arefin, Rifat
    Ryu, Byungyong
    Chae, Oksam
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2018, 12 (12): : 6000 - 6017
  • [10] Person-independent facial expression analysis by fusing multiscale cell features
    Zhou, Lubing
    Wang, Han
    OPTICAL ENGINEERING, 2013, 52 (03)