MR-CapsNet: A Deep Learning Algorithm for Image-Based Head Pose Estimation on CapsNet

Cited by: 4

Authors:

Fang, Hao ^{[1,2,3]}

Liu, Jun-Qing ^{[4]}

Xie, Kai ^{[1,2,3]}

Wu, Peng ^{[1,2]}

Zhang, Xin-Yu ^{[1,2,3]}

Wen, Chang ^{[3,4]}

He, Jian-Biao ^{[5]}

Affiliations:

[1] Yangtze Univ, Sch Elect Informat, Jingzhou 434023, Peoples R China

[2] Yangtze Univ, Natl Demonstrat Ctr Expt Elect & Elect Educ, Jingzhou 434023, Peoples R China

[3] Yangtze Univ, West Inst, Kelamayi 834000, Peoples R China

[4] Yangtze Univ, Sch Comp Sci, Jingzhou 434023, Peoples R China

[5] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China

Source:

IEEE ACCESS | 2021 / Vol. 9

Keywords:

Head; Feature extraction; Face recognition; Pose estimation; Magnetic heads; Training; Task analysis; Head pose estimation; multi-stage regression; squeeze-and-excitation block; capsule network;

DOI:

10.1109/ACCESS.2021.3119615

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Head pose estimation from a single image is challenging because of complex background conditions and the characteristics of the human face. In this paper, we propose a Multi-stage Regression-Capsule Network (MR-CapsNet) to predict head pose from a single input image. We use a residual attention block and a squeeze-and-excitation block to extract features at three levels. CapsNet overcomes shortcomings of the traditional convolutional neural network and aggregates the modules so that the spatial relationships among the aggregated features are described, while a multi-stage regression scheme keeps the model compact and robust. We tested our method on the AFLW2000 and BIWI datasets, obtaining mean absolute errors of 4.26% and 3.95%, respectively. In addition, we discuss the accuracy of our method when the eyes or mouth are occluded. Comprehensive experiments show that our method can accurately predict head pose.
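The abstract reports mean absolute error (MAE) without describing how it is computed. The following is a minimal sketch of one common way to compute it, assuming each pose is a (yaw, pitch, roll) triple and that the overall figure is the mean of the three per-angle errors. The function name, the type alias, and the sample values are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

# Assumption: a pose is (yaw, pitch, roll); units follow whatever the inputs use.
Pose = Tuple[float, float, float]


def mean_absolute_error(
    predicted: Sequence[Pose], actual: Sequence[Pose]
) -> Tuple[Pose, float]:
    """Return the per-angle MAE and the mean of the three per-angle values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    # Average the absolute difference separately for yaw, pitch, and roll.
    per_angle = tuple(
        sum(abs(p[i] - a[i]) for p, a in zip(predicted, actual)) / n
        for i in range(3)
    )
    return per_angle, sum(per_angle) / 3


# Made-up values for illustration only; these are not results from the paper.
pred = [(10.0, -5.0, 2.0), (-20.0, 8.0, 0.0)]
true = [(12.0, -4.0, 2.0), (-17.0, 6.0, 1.0)]
per_angle, overall = mean_absolute_error(pred, true)
```

This sketch does not wrap angle differences, which is adequate only when poses stay well away from the ±180° boundary.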

Citation

Pages: 141245-141257

Page count: 13

References (45 in total)

[1] Abate, Andrea F.; Barra, Paola; Pero, Chiara; Tucci, Maurizio. Head pose estimation by regression algorithm [J]. PATTERN RECOGNITION LETTERS, 2020, 140: 179-185

[2] Belhumeur PN, 2011, PROC CVPR IEEE, P545, DOI 10.1109/CVPR.2011.5995602

[3] Bulat, Adrian; Tzimiropoulos, Georgios. How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks) [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 1021-1030

[4] Chamveha I., 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), P1713, DOI 10.1109/ICCVW.2011.6130456

[5] Chang, Feng-Ju; Anh Tuan Tran; Hassner, Tal; Masi, Iacopo; Nevatia, Ram; Medioni, Gerard. FacePoseNet: Making a Case for Landmark-Free Face Alignment [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017: 1599-1608

[6] Chen D, 2014, LECT NOTES COMPUT SC, V8694, P109, DOI 10.1007/978-3-319-10599-4_8

[7] Chih-Wei Chen, 2011, Proceedings 2011 IEEE International Conference on Automatic Face & Gesture Recognition (FG 2011), P933, DOI 10.1109/FG.2011.5771376

[8] Cootes, TF; Taylor, CJ; Cooper, DH; Graham, J. Active Shape Models - Their Training and Application [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 1995, 61 (01): 38-59

[9] Diaz-Chito, Katerine; Del Rincon, Jesus Martinez; Hernandez-Sabate, Aura; Gil, Debora. Continuous Head Pose Estimation Using Manifold Subspace Embedding and Multivariate Regression [J]. IEEE ACCESS, 2018, 6: 18325-18334

[10] Drouard, Vincent; Horaud, Radu; Deleforge, Antoine; Ba, Sileye; Evangelidis, Georgios. Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2017, 26 (03): 1428-1440
