Cross-Domain Facial Expression Recognition via Contrastive Warm up and Complexity-Aware Self-Training

Cited by: 3
Authors
Li, Yingjian [1 ,2 ]
Huang, Jiaxing [3 ]
Lu, Shijian [3 ]
Zhang, Zheng [1 ,4 ]
Lu, Guangming [4 ,5 ]
Affiliations
[1] Peng Cheng Lab, Shenzhen 518055, Peoples R China
[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[4] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China
[5] Guangdong Prov Key Lab Novel Secur Intelligence Te, Shenzhen 518055, Peoples R China
Keywords
Facial expression recognition; domain adaptation; contrastive learning; label selection; self-training; FEATURES;
DOI
10.1109/TIP.2023.3318955
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unsupervised cross-domain Facial Expression Recognition (FER) aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Existing methods strive to reduce the discrepancy between the source and target domains, but cannot effectively explore the abundant semantic information of the target domain due to the absence of target labels. To this end, we propose a novel framework via Contrastive Warm up and Complexity-aware Self-Training (CWCST), which facilitates source knowledge transfer and target semantic learning jointly. Specifically, we formulate a contrastive warm-up strategy over features, momentum features, and learnable category centers to concurrently learn discriminative representations and narrow the domain gap, which benefits domain adaptation by generating more accurate target pseudo labels. Moreover, to deal with the inevitable noise in pseudo labels, we develop complexity-aware self-training with a label selection module based on prediction entropy, which iteratively generates pseudo labels and adaptively chooses the reliable ones for training, ultimately yielding effective exploration of target semantics. Furthermore, by jointly using these two components, our framework effectively utilizes source knowledge and target semantic information through source-target co-training. In addition, our framework can be easily incorporated into other baselines with consistent performance improvements. Extensive experimental results on seven databases show the superior performance of the proposed method against various baselines.
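The record contains no implementation details, but the entropy-based pseudo-label selection described in the abstract can be illustrated with a minimal sketch. The snippet below assumes a PyTorch setup; the function name select_reliable_pseudo_labels and the fixed keep_ratio hyper-parameter are hypothetical illustrations, not the authors' code, which adapts the selection across self-training iterations.

import torch
import torch.nn.functional as F

def select_reliable_pseudo_labels(logits: torch.Tensor, keep_ratio: float = 0.5):
    """Assign pseudo labels to unlabeled target samples and keep only the
    lowest-entropy (most confident) fraction for the next self-training round.

    logits:     [N, C] raw classifier outputs on target-domain images
    keep_ratio: fraction of samples to retain (assumed fixed here for brevity)
    """
    probs = F.softmax(logits, dim=1)                      # [N, C] class probabilities
    entropy = -(probs * torch.log(probs + 1e-8)).sum(1)   # [N] prediction entropy
    pseudo_labels = probs.argmax(dim=1)                   # [N] hard pseudo labels

    # Low entropy indicates a confident prediction; keep the most reliable subset.
    num_keep = max(1, int(keep_ratio * logits.size(0)))
    reliable_idx = torch.argsort(entropy)[:num_keep]
    return reliable_idx, pseudo_labels[reliable_idx]

# Illustrative usage inside one self-training round:
# logits = model(target_images)                 # forward pass on unlabeled target data
# idx, labels = select_reliable_pseudo_labels(logits, keep_ratio=0.6)
# loss = F.cross_entropy(logits[idx], labels)   # train on the selected pseudo-labeled subset

Low-entropy predictions serve as reliable pseudo labels; the selected subset is then used as supervised targets for the target domain while the labeled source data continues to provide ground-truth supervision, in the spirit of the source-target co-training described above.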
Pages: 5438-5450
Number of pages: 13