Multiclass classification based on a deep convolutional network for head pose estimation

Cited by: 14

Authors:

Cai, Ying [1, 2]

Yang, Meng-long [3]

Li, Jun [2]

Affiliations:

[1] Sichuan Univ, Sch Comp Sci, Chengdu 610065, Peoples R China

[2] Sichuan Agr Univ, Coll Informat Engn, Yaan 625014, Peoples R China

[3] Sichuan Univ, Sch Aeronaut & Astronaut, Chengdu 610065, Peoples R China

Source:

FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING | 2015 / Vol. 16 / No. 11

Funding:

National Natural Science Foundation of China;

Keywords:

Head pose estimation; Deep convolutional neural network; Multiclass classification; RECOGNITION; POINT;

DOI:

10.1631/FITEE.1500125

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Head pose estimation is an important and challenging task in computer vision. In this paper we propose a method for estimating head pose from 2D face images using a deep convolutional neural network (DCNN). We design a simple and effective method to roughly crop the face from the input image while preserving each individual's relative facial feature proportions; the method works across various poses. Two convolutional neural networks are then set up to train the head pose classifier and are compared with each other. The simpler network has six layers. It performs well on seven yaw poses but is less satisfactory when two pitch poses are mixed in. The other network has eight layers and more pixels in its input layer. It performs better with more poses and more training samples. Before the networks are trained, two strategies, shift and zoom, are applied to prepare the training samples. Finally, the feature extraction filters are optimized jointly with the weights of the classification component during training to minimize the classification error. The method is evaluated on the CAS-PEAL-R1, CMU PIE, and CUBIC FacePix databases, where it outperforms state-of-the-art methods for head pose estimation.
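The abstract names shift and zoom as the two strategies for preparing training samples but gives no parameters. The following is a minimal NumPy sketch of what such an augmentation step could look like, not the authors' implementation. The function names (`shift_image`, `zoom_image`, `augment`), the shift offsets, the zoom factors, the zero padding, and the nearest-neighbour resampling are all assumptions made for illustration.

```python
import numpy as np


def shift_image(img, dx, dy):
    """Translate a 2D image by (dx, dy) pixels; exposed borders are zero-filled (assumed)."""
    h, w = img.shape
    out = np.zeros_like(img)
    # Matching source and destination windows that stay inside the frame.
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out


def zoom_image(img, factor):
    """Zoom about the image centre by `factor` (> 0); the output keeps the input shape.

    Nearest-neighbour sampling is an assumption; the abstract does not specify one.
    """
    h, w = img.shape
    # Map every output pixel back to its source pixel, clamping at the borders.
    ys = ((np.arange(h) - h / 2) / factor + h / 2).astype(int).clip(0, h - 1)
    xs = ((np.arange(w) - w / 2) / factor + w / 2).astype(int).clip(0, w - 1)
    return img[np.ix_(ys, xs)]


def augment(img, shifts=(-2, 0, 2), zooms=(1.0, 1.1)):
    """Return every zoom/shift combination of one cropped face image.

    The default offsets and factors are hypothetical values, not taken from the paper.
    """
    samples = []
    for z in zooms:
        zoomed = zoom_image(img, z)
        for dy in shifts:
            for dx in shifts:
                samples.append(shift_image(zoomed, dx, dy))
    return samples
```

With the default arguments, each input image yields one sample per zoom factor and per (dx, dy) offset pair, all with the same shape as the input, so they can be fed to a fixed-size input layer.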


Pages: 930-939

Page count: 10
