Learning toward practical head pose estimation

Cited by: 4

Authors:

Sang, Gaoli [1]

He, Feixiang [2]

Zhu, Rong [1]

Xuan, Shibin [3]

Affiliations:

[1] Jiaxing Univ, Dept Math & Informat Engn, Jiaxing, Peoples R China

[2] Sichuan Univ, Coll Comp Sci, Chengdu, Sichuan, Peoples R China

[3] Guangxi Univ Nationalities, Coll Informat Sci & Engn, Nanning, Peoples R China

Source:

OPTICAL ENGINEERING | 2017 / Vol. 56 / No. 08

Funding:

National Science Foundation (United States);

Keywords:

head pose estimation; learning; deep convolutional neural networks; multivariate labeling distributions; face recognition; 3D;

DOI:

10.1117/1.OE.56.8.083104

Chinese Library Classification (CLC) number:

O43 [Optics];

Discipline classification code:

070207 ; 0803 ;

Abstract:

Head pose is useful information for many face-related tasks, such as face recognition, behavior analysis, and human-computer interfaces. Existing head pose estimation methods usually assume that the face images have been well aligned or that sufficient and precise training data are available. In practical applications, however, these assumptions are very likely to be invalid. This paper first investigates the impact of the failure of these assumptions, i.e., misalignment of face images, uncertainty and undersampling of training data, on head pose estimation accuracy of state-of-the-art methods. A learning-based approach is then designed to enhance the robustness of head pose estimation to these factors. To cope with misalignment, instead of using handcrafted features, it seeks suitable features by learning from a set of training data with a deep convolutional neural network (DCNN), such that the training data can be best classified into the correct head pose categories. To handle uncertainty and undersampling, it employs multivariate labeling distributions (MLDs) with dense sampling intervals to represent the head pose attributes of face images. The correlation between the features and the dense MLD representations of face images is approximated by a maximum entropy model, whose parameters are optimized on the given training data. To estimate the head pose of a face image, its MLD representation is first computed according to the model based on the features extracted from the image by the trained DCNN, and its head pose is then assumed to be the one corresponding to the peak in its MLD. Evaluation experiments on the Pointing'04, FacePix, Multi-PIE, and CASIA-PEAL databases prove the effectiveness and efficiency of the proposed method. © 2017 Society of Photo-Optical Instrumentation Engineers (SPIE)
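
The estimation steps described in the abstract (an MLD over a dense pose grid, a maximum entropy model mapping features to that distribution, and a read-out at the peak) can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian shape of the label distribution, the value of `sigma`, the 15-degree grid, and all function names are assumptions made for illustration, and the feature vector stands in for the output of a trained DCNN.

```python
import math

def pose_grid(yaws, pitches):
    """Enumerate (yaw, pitch) pose labels on a dense grid (grid spacing is an assumption)."""
    return [(y, p) for y in yaws for p in pitches]

def make_mld(true_pose, poses, sigma=15.0):
    """Build a label distribution over `poses` for one training image.

    Assumption: a discretized isotropic 2D Gaussian centered on the
    ground-truth pose; the abstract does not specify the actual shape.
    """
    weights = [
        math.exp(-((y - true_pose[0]) ** 2 + (p - true_pose[1]) ** 2)
                 / (2.0 * sigma ** 2))
        for y, p in poses
    ]
    total = sum(weights)
    return [w / total for w in weights]

def maxent_predict(features, theta):
    """Maximum entropy model: p(k | x) proportional to exp(theta[k] . x).

    `theta` holds one weight vector per pose label; in practice it would be
    fitted on training data, which this sketch does not do.
    """
    scores = [sum(t * f for t, f in zip(row, features)) for row in theta]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]

def peak_pose(dist, poses):
    """Return the pose label at the peak of the distribution."""
    best = max(range(len(poses)), key=lambda k: dist[k])
    return poses[best]

# Hypothetical usage: yaw in [-90, 90] and pitch in [-30, 30], 15-degree steps.
poses = pose_grid(range(-90, 91, 15), range(-30, 31, 15))
target = make_mld((30, -15), poses)
estimate = peak_pose(target, poses)
```

Representing each image by a distribution over neighboring poses rather than a single label lets one training image contribute to nearby pose classes, which is how the abstract motivates the handling of uncertain and undersampled training data.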

Pages: 11
