Cascaded Iterative Transformer for Jointly Predicting Facial Landmark, Occlusion Probability and Head Pose

Cited by: 2
Authors
Li, Yaokun [1]
Tan, Guang [1]
Gou, Chao [1]
Affiliations
[1] Shenzhen Campus Sun Yat Sen Univ, Sch Intelligent Syst Engn, Shenzhen 518107, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Facial landmark detection; Occlusion probability prediction; Head pose estimation; Multi-task learning; Coupling relationship; FACE ALIGNMENT; REGRESSION;
DOI
10.1007/s11263-023-01935-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Landmark detection under large poses with occlusion has been one of the challenging problems in facial analysis. Recently, many works have jointly predicted pose or occlusion in the multi-task learning (MTL) paradigm, trying to tap into their dependencies and thus alleviate this issue. However, such implicit dependencies are weakly interpretable and inconsistent with the way humans exploit inter-task coupling relations, i.e., accommodating the induced explicit effects. This is one of the key factors hindering their performance. To this end, we propose a Cascaded Iterative Transformer (CIT) to jointly predict facial landmarks, occlusion probability, and head pose. Besides implicitly mining task dependencies in a shared encoder, the proposed CIT employs a cost-effective and portability-friendly strategy that passes the decoders' predictions as prior knowledge, so as to exploit the coupling-induced effects in a human-like way. Moreover, to the best of our knowledge, no dataset contains all these task annotations simultaneously, so we introduce a new dataset termed MERL-RAV-FLOP based on the MERL-RAV dataset. We conduct extensive experiments on several challenging datasets (300W-LP, AFLW2000-3D, BIWI, COFW, and MERL-RAV-FLOP) and achieve remarkable results. The code and dataset can be accessed at https://github.com/Iron-LYK/CIT.
Pages: 1242-1257
Page count: 16
Related papers
66 in total
[1]   A Data-Driven Approach to Improve 3D Head-Pose Estimation [J].
Aghli, Nima ;
Ribeiro, Eraldo .
ADVANCES IN VISUAL COMPUTING (ISVC 2021), PT I, 2021, 13017 :546-558
[2]   img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation [J].
Albiero, Vitor ;
Chen, Xingyu ;
Yin, Xi ;
Pang, Guan ;
Hassner, Tal .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :7613-7623
[3]   Faster Than Real-time Facial Alignment: A 3D Spatial Transformer Network Approach in Unconstrained Poses [J].
Bhagavatula, Chandrasekhar ;
Zhu, Chenchen ;
Luu, Khoa ;
Savvides, Marios .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :4000-4009
[4]   FASHE: A FrActal Based Strategy for Head Pose Estimation [J].
Bisogni, Carmen ;
Nappi, Michele ;
Pero, Chiara ;
Ricciardi, Stefano .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 :3192-3203
[5]   How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks) [J].
Bulat, Adrian ;
Tzimiropoulos, Georgios .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :1021-1030
[6]   Robust face landmark estimation under occlusion [J].
Burgos-Artizzu, Xavier P. ;
Perona, Pietro ;
Dollar, Piotr .
2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, :1513-1520
[7]   Face Alignment by Explicit Shape Regression [J].
Cao, Xudong ;
Wei, Yichen ;
Wen, Fang ;
Sun, Jian .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2014, 107 (02) :177-190
[8]   A Vector-based Representation to Enhance Head Pose Estimation [J].
Cao, Zhiwen ;
Chu, Zongcheng ;
Liu, Dongfang ;
Chen, Yingjie .
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, :1187-1196
[9]   RetinaFace: Single-shot Multi-level Face Localisation in the Wild [J].
Deng, Jiankang ;
Guo, Jia ;
Ververas, Evangelos ;
Kotsia, Irene ;
Zafeiriou, Stefanos .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, :5202-5211
[10]   RepVGG: Making VGG-style ConvNets Great Again [J].
Ding, Xiaohan ;
Zhang, Xiangyu ;
Ma, Ningning ;
Han, Jungong ;
Ding, Guiguang ;
Sun, Jian .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :13728-13737