Three-Stream Head Pose Estimation Algorithm Based on Multi-Stage Feature Fusion

Cited by: 1
Authors
Han, Xue [1, 2]
Zhang, Hongying [1, 2]
Lu, Xiuwen [1, 2]
Zhang, Qi [1, 2]
Affiliations
[1] School of Information Engineering, Southwest University of Science and Technology, Mianyang, Sichuan
[2] Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Southwest University of Science and Technology, Mianyang, Sichuan
Keywords
efficient channel attention; feature extraction; feature fusion; GhostNet; head pose estimation
DOI
10.3778/j.issn.1002-8331.2204-0069
Chinese Library Classification (CLC) number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
Existing head pose estimation algorithms suffer from poor real-time performance and low recognition accuracy in complex scenes. To address these problems, a three-stream head pose estimation algorithm based on multi-stage feature fusion is proposed. The algorithm has a multi-level output structure: three different types of networks extract features from the input image, and each branch has three stages, each of which only needs to refine the features of the previous stage. A feature fusion module combines the feature maps extracted at the same stage, which effectively avoids feature loss. The feature extraction module uses the Ghost module as its feature extraction network and applies model compression to reduce network parameters and computation while preserving accuracy. To extract more important and effective features, the efficient channel attention module ECA-Net is introduced, which improves the accuracy of head pose estimation. Experimental results show that the proposed algorithm performs well on both the AFLW2000 and BIWI datasets: with a model size of only 0.55 MB, it achieves a mean absolute error (MAE) of 4.68 on AFLW2000 and 3.59 on BIWI, lower than many current head pose estimation methods. © 2023 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.
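The abstract reports results as MAE but does not spell out how it is computed. The following is a minimal sketch, assuming the convention common in head pose estimation work: the absolute error in degrees is averaged over samples for each of yaw, pitch, and roll, and the three per-axis values are then averaged. The function name `head_pose_mae` and the sample angles are hypothetical and are not taken from the paper.

```python
def head_pose_mae(pred, truth):
    """Return (per-axis MAE, overall MAE) for Euler angles in degrees.

    pred, truth: equal-length lists of (yaw, pitch, roll) tuples.
    Assumption: overall MAE is the mean of the three per-axis MAEs;
    the abstract does not state the exact protocol.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("pred and truth must be non-empty and equal in length")

    sums = [0.0, 0.0, 0.0]
    for p, t in zip(pred, truth):
        for axis in range(3):
            sums[axis] += abs(p[axis] - t[axis])

    n = len(pred)
    per_axis = [s / n for s in sums]       # [yaw, pitch, roll]
    overall = sum(per_axis) / 3.0
    return per_axis, overall


# Hypothetical predictions and ground truth, for illustration only.
pred = [(10.0, 5.0, -2.0), (-30.0, 0.0, 4.0)]
truth = [(12.0, 4.0, -2.0), (-26.0, 3.0, 1.0)]
per_axis, overall = head_pose_mae(pred, truth)
print(per_axis, round(overall, 4))
```

Under this convention, the reported figures of 4.68 and 3.59 would be read as degrees averaged across the three rotation axes.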
Pages: 212-222
Page count: 10