An Effective Deep Network for Head Pose Estimation without Keypoints

Cited by: 0
Authors
Thai, Chien [1]
Tran, Viet [1]
Bui, Minh [1]
Ninh, Huong [1]
Tran, Hai [1]
Affiliations
[1] Viettel Aerospace Institute, Optoelectronics Center, Computer Vision Department, Hanoi, Vietnam
Source
PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM) | 2021
Keywords
Head Pose Estimation; Knowledge Distillation; Convolutional Neural Network; Support Vector Machines; Face; Cascade
DOI
10.5220/0010870900003122
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human head pose estimation is an essential problem in facial analysis, with many computer vision applications such as gaze estimation, virtual reality, and driver assistance. Given its importance, a compact model is needed to reduce the computational cost of deployment in facial-analysis applications, such as large camera surveillance systems and AI cameras, while maintaining accuracy. In this work, we propose a lightweight model that effectively addresses the head pose estimation problem. Our approach has two main steps. 1) We first train multiple teacher models on the synthetic dataset 300W-LPA to obtain head pose pseudo labels. 2) We design an architecture with a ResNet18 backbone and train the proposed model on the ensemble of these pseudo labels via knowledge distillation. To evaluate the effectiveness of our model, we use AFLW-2000 and BIWI, two real-world head pose datasets. Experimental results show that the proposed model significantly improves accuracy compared with state-of-the-art head pose estimation methods. Furthermore, our model runs in real time at approximately 300 FPS when inferring on a Tesla V100.
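The two-step idea in the abstract (teacher predictions combined into pseudo labels, then a student trained against them) can be illustrated with a minimal sketch. The abstract does not say how the ensemble is formed or which loss is used, so the plain arithmetic mean over teachers and the mean absolute error below are assumptions, and the function names `ensemble_pseudo_label` and `mean_absolute_error` are hypothetical. Angles are (yaw, pitch, roll) in degrees, and the sample values are made up for illustration.

```python
from statistics import fmean


def ensemble_pseudo_label(teacher_preds):
    """Combine per-teacher (yaw, pitch, roll) predictions for one image.

    Assumption: the ensemble is a plain arithmetic mean per angle.
    This is only reasonable for angles far from the +/-180 degree wrap-around.
    """
    if not teacher_preds:
        raise ValueError("need at least one teacher prediction")
    yaws, pitches, rolls = zip(*teacher_preds)
    return (fmean(yaws), fmean(pitches), fmean(rolls))


def mean_absolute_error(student_pred, pseudo_label):
    """Assumed distillation loss: mean absolute error over the three angles."""
    return fmean(abs(s - t) for s, t in zip(student_pred, pseudo_label))


# Hypothetical predictions from three teacher models for a single image.
teachers = [(10.0, -5.0, 2.0), (12.0, -7.0, 0.0), (14.0, -6.0, 1.0)]
label = ensemble_pseudo_label(teachers)

# Hypothetical student output for the same image, scored against the pseudo label.
loss = mean_absolute_error((11.0, -6.0, 1.0), label)
```

In an actual training loop the student (a ResNet18-based network, per the abstract) would be optimized to reduce such a loss over the whole pseudo-labelled dataset; the sketch only shows the label construction and scoring for one sample.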
Pages: 90-98
Number of pages: 9