3D Head Pose Estimation via Normal Maps: A Generalized Solution for Depth Image, Point Cloud, and Mesh

Cited by: 0
Authors
Wu, Jiang [1, 2, 3]
Chen, Hua [1, 2, 3, 4]
Affiliations
[1] Chinese Acad Sci, Beijing Inst Genom, Beijing 100101, Peoples R China
[2] China Natl Ctr Bioinformat, Beijing 100101, Peoples R China
[3] Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China
[4] Chinese Acad Sci, CAS Ctr Excellence Anim Evolut & Genet, Kunming 650223, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
3D head pose estimation; deep learning; generalized Procrustes analysis; head rigid registration;
DOI
10.1002/aisy.202400159
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Head pose estimation plays a crucial role in various applications, including human-machine interaction, autonomous driving systems, and 3D reconstruction. Current methods address the problem primarily from a 2D perspective, which limits the efficient utilization of 3D information. Herein, a novel approach called the pose orientation-aware network (POANet) is introduced. It leverages normal maps to embed orientation information, providing abundant and robust head pose information. POANet incorporates an axial signal perception module and a rotation matrix perception module; these lightweight modules allow the approach to achieve state-of-the-art (SOTA) performance at low computational cost. The method can directly analyze 3D data with various topologies without extensive preprocessing. For depth images, POANet outperforms existing methods on the Biwi Kinect head pose dataset, reducing the mean absolute error (MAE) by approximately 30% compared to the SOTA methods. POANet is the first method to perform rigid head registration in a landmark-free manner. It also incorporates few-shot learning capabilities and achieves an MAE of about $1^{\circ}$ on the Headspace dataset. These features make POANet a superior alternative to traditional generalized Procrustes analysis for mesh data processing, offering enhanced convenience for human phenotype studies.
Pose orientation-aware network (POANet), a lightweight model for 3D head pose estimation that analyzes 3D data with various topologies, is introduced. POANet outperforms state-of-the-art methods on the BIWI depth image dataset, reducing the mean absolute error by approximately 30%. Additionally, it is the first solution to perform rigid head registration in a landmark-free manner on mesh data.
© 2024 WILEY-VCH GmbH
Pages: 11
Related papers
37 in total
  • [1] Real-Time 3D Head Pose Tracking Through 2.5D Constrained Local Models with Local Neural Fields
    Ackland, Stephen
    Chiclana, Francisco
    Istance, Howell
    Coupland, Simon
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2019, 127 (6-7) : 579 - 598
  • [2] Baltrusaitis T, 2012, PROC CVPR IEEE, P2610, DOI 10.1109/CVPR.2012.6247980
  • [3] BESL PJ, 1992, P SOC PHOTO-OPT INS, V1611, P586, DOI 10.1117/12.57955
  • [4] Large Scale 3D Morphable Models
    Booth, James
    Roussos, Anastasios
    Ponniah, Allan
    Dunaway, David
    Zafeiriou, Stefanos
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2018, 126 (2-4) : 233 - 254
  • [5] POSEidon: Face-from-Depth for Driver Pose Estimation
    Borghi, Guido
    Venturelli, Marco
    Vezzani, Roberto
    Cucchiara, Rita
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5494 - 5503
  • [6] A Vector-based Representation to Enhance Head Pose Estimation
    Cao, Zhiwen
    Chu, Zongcheng
    Liu, Dongfang
    Chen, Yingjie
    [J]. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 1187 - 1196
  • [7] 2D Image head pose estimation via latent space regression under occlusion settings
    Celestino, Jose
    Marques, Manuel
    Nascimento, Jacinto C.
    Costeira, Joao Paulo
    [J]. PATTERN RECOGNITION, 2023, 137
  • [8] Connecting Gaze, Scene, and Attention: Generalized Attention Estimation via Joint Modeling of Gaze and Scene Saliency
    Chong, Eunji
    Ruiz, Nataniel
    Wang, Yongxin
    Zhang, Yun
    Rozga, Agata
    Rehg, James M.
    [J]. COMPUTER VISION - ECCV 2018, PT V, 2018, 11209 : 397 - 412
  • [9] Statistical Modeling of Craniofacial Shape and Texture
    Dai, Hang
    Pears, Nick
    Smith, William
    Duncan, Christian
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2020, 128 (2) : 547 - 571
  • [10] Dosovitskiy A., 2021, INT C LEARNING REPRE, DOI 10.48550/ARXIV.2010.11929