MR-CapsNet: A Deep Learning Algorithm for Image-Based Head Pose Estimation on CapsNet

Cited by: 3
Authors
Fang, Hao [1 ,2 ,3 ]
Liu, Jun-Qing [4 ]
Xie, Kai [1 ,2 ,3 ]
Wu, Peng [1 ,2 ]
Zhang, Xin-Yu [1 ,2 ,3 ]
Wen, Chang [3 ,4 ]
He, Jian-Biao [5 ]
Affiliations
[1] Yangtze Univ, Sch Elect Informat, Jingzhou 434023, Peoples R China
[2] Yangtze Univ, Natl Demonstrat Ctr Expt Elect & Elect Educ, Jingzhou 434023, Peoples R China
[3] Yangtze Univ, West Inst, Kelamayi 834000, Peoples R China
[4] Yangtze Univ, Sch Comp Sci, Jingzhou 434023, Peoples R China
[5] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
Keywords
Head; Feature extraction; Face recognition; Pose estimation; Magnetic heads; Training; Task analysis; Head pose estimation; multi-stage regression; squeeze-and-excitation block; capsule network;
DOI
10.1109/ACCESS.2021.3119615
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Estimating head pose from a single image is challenging because of complex backgrounds and the variability of the human face. We propose a Multi-stage Regression-Capsule Network (MR-CapsNet) that predicts head pose from a single input image. The network uses residual attention blocks and squeeze-and-excitation blocks to extract features at three levels. A CapsNet then aggregates these features and describes the spatial relationships among them, addressing a shortcoming of traditional convolutional neural networks, while a multi-stage regression scheme keeps the model compact and robust. On the AFLW2000 and BIWI datasets, the method obtains mean absolute errors of 4.26% and 3.95%, respectively. We also discuss the accuracy of the method when the eyes or mouth are occluded. Comprehensive experiments show that the method predicts head pose accurately.
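The mean absolute error quoted in the abstract can be illustrated with a minimal sketch. It assumes each pose is a (yaw, pitch, roll) triple and that the overall score is the average of the three per-angle errors; the record does not state the exact protocol, so the function name, the angle ordering, and the averaging are assumptions made for illustration. The function is unit-agnostic and reports errors in whatever unit the inputs use.

```python
from typing import Sequence, Tuple

# Hypothetical pose representation: (yaw, pitch, roll). The ordering is an assumption.
Pose = Tuple[float, float, float]


def mean_absolute_error(
    pred: Sequence[Pose], truth: Sequence[Pose]
) -> Tuple[float, float, float, float]:
    """Return (yaw_mae, pitch_mae, roll_mae, overall_mae).

    overall_mae is the plain average of the three per-angle errors,
    which is an assumed convention, not one confirmed by the record.
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("pred and truth must be non-empty and of equal length")

    n = len(pred)
    # Average the absolute difference separately for each of the three angles.
    per_angle = [
        sum(abs(p[i] - t[i]) for p, t in zip(pred, truth)) / n
        for i in range(3)
    ]
    overall = sum(per_angle) / 3
    return per_angle[0], per_angle[1], per_angle[2], overall


if __name__ == "__main__":
    predictions = [(10.0, 0.0, -5.0), (20.0, 4.0, 5.0)]
    ground_truth = [(12.0, 0.0, -5.0), (16.0, 2.0, 8.0)]
    print(mean_absolute_error(predictions, ground_truth))
```

Reporting the per-angle errors next to the overall value makes it easier to see whether one rotation axis dominates the total.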
Pages: 141245 - 141257
Number of pages: 13