Real-time 6DoF full-range markerless head pose estimation

Cited by: 5
Authors
Algabri, Redhwan [1 ]
Shin, Hyunsoo [2 ]
Lee, Sungon [3 ]
Affiliations
[1] Hanyang Univ, Res Inst Engn & Technol, Ansan 15588, South Korea
[2] Hanyang Univ, Dept Elect & Elect Engn, Ansan 15588, South Korea
[3] Hanyang Univ, Dept Robot, Ansan 15588, South Korea
Funding
National Research Foundation, Singapore
Keywords
Head pose estimation; Full-range angles; 6DoF poses; Landmark-free; Deep learning;
DOI
10.1016/j.eswa.2023.122293
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Head pose estimation is a fundamental function for several applications in human-computer interaction. Most of these applications require accurate six degrees of freedom head pose estimation (6DoF-HPE) over full-range angles and take sequential images of the human head as input. Most existing head pose estimation methods focus on a three degrees of freedom (3DoF) frontal head, which restricts their use in real-world scenarios. This study presents a framework that estimates head pose without landmark localization. The novelty of our framework lies in estimating 6DoF head poses over full-range angles in real time. The proposed framework leverages deep neural networks to detect human heads and predict their angles, using a single shot multibox detector (SSD) and a RepVGG-b1g4 backbone, respectively. It uses red, green, blue, and depth (RGB-D) data to estimate the rotational and translational components relative to the camera pose. The framework employs a continuous representation to predict the angles and a multi-loss approach in its training strategy; the regression function combines the geodesic loss with the mean squared error. Ground-truth labels for full-range head angles were extracted from the public Carnegie Mellon University (CMU) Panoptic dataset. This study provides a comprehensive comparison with state-of-the-art methods on public benchmark datasets. Experiments demonstrate that the proposed method matches or outperforms state-of-the-art methods. The code and datasets are available at: https://github.com/Redhwan-A/6DoFHPE.
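For illustration, the "continuous representation" and the geodesic-plus-MSE regression loss described in the abstract can be sketched in PyTorch. This is a hedged sketch, not the authors' released code (see the linked repository for that): it assumes the 6D rotation parameterization of Zhou et al. (2019), and the function names and loss weights (w_rot, w_trans) are placeholders.

import torch
import torch.nn.functional as F

def rotation_6d_to_matrix(x6):
    # Gram-Schmidt mapping from a 6D continuous representation to a
    # 3x3 rotation matrix (Zhou et al., 2019); x6 has shape (B, 6).
    a1, a2 = x6[..., :3], x6[..., 3:]
    b1 = F.normalize(a1, dim=-1)
    b2 = F.normalize(a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    return torch.stack((b1, b2, b3), dim=-2)  # (B, 3, 3)

def geodesic_loss(R_pred, R_gt, eps=1e-7):
    # Geodesic angle between two rotation matrices:
    # theta = arccos((trace(R_pred^T R_gt) - 1) / 2).
    m = torch.bmm(R_pred.transpose(1, 2), R_gt)
    cos = (m.diagonal(dim1=-2, dim2=-1).sum(-1) - 1.0) / 2.0
    return torch.acos(torch.clamp(cos, -1.0 + eps, 1.0 - eps)).mean()

def pose_loss(x6_pred, R_gt, t_pred, t_gt, w_rot=1.0, w_trans=1.0):
    # Combined objective: geodesic loss on the rotation plus mean
    # squared error on the translation. The weights are assumptions;
    # the paper's exact multi-loss formulation may differ.
    R_pred = rotation_6d_to_matrix(x6_pred)
    return w_rot * geodesic_loss(R_pred, R_gt) + w_trans * F.mse_loss(t_pred, t_gt)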
Pages: 13