3D Head Pose Estimation via Normal Maps: A Generalized Solution for Depth Image, Point Cloud, and Mesh

Citations: 0

Authors:

Wu, Jiang [1,2,3]

Chen, Hua [1,2,3,4]

Affiliations:

[1] Chinese Acad Sci, Beijing Inst Genom, Beijing 100101, Peoples R China

[2] China Natl Ctr Bioinformat, Beijing 100101, Peoples R China

[3] Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China

[4] Chinese Acad Sci, CAS Ctr Excellence Anim Evolut & Genet, Kunming 650223, Peoples R China

Source:

ADVANCED INTELLIGENT SYSTEMS | 2024 / Vol. 6 / Issue 11

Funding:

National Natural Science Foundation of China;

Keywords:

3D head pose estimation; deep learning; generalized Procrustes analysis; head rigid registration;

DOI:

10.1002/aisy.202400159

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Head pose estimation plays a crucial role in various applications, including human-machine interaction, autonomous driving systems, and 3D reconstruction. Current methods address the problem primarily from a 2D perspective, which limits the efficient use of 3D information. Herein, a novel approach called the pose orientation-aware network (POANet) is introduced. It leverages normal maps to embed orientation information, providing abundant and robust head pose information. POANet incorporates an axial signal perception module and a rotation matrix perception module; these lightweight modules allow the approach to achieve state-of-the-art (SOTA) performance at low computational cost. The method can directly analyze 3D data of various topologies without extensive preprocessing. For depth images, POANet outperforms existing methods on the Biwi Kinect head pose dataset, reducing the mean absolute error (MAE) by approximately 30% compared to the SOTA methods. POANet is the first method to perform rigid head registration in a landmark-free manner. It also incorporates few-shot learning capabilities and achieves an MAE of about 1° on the Headspace dataset. These features make POANet a superior alternative to traditional generalized Procrustes analysis for mesh data processing, offering enhanced convenience for human phenotype studies.

Pose orientation-aware network (POANet), a lightweight model for 3D head pose estimation that analyzes 3D data of diverse topologies, is introduced. POANet outperforms state-of-the-art methods on the BIWI depth image dataset, reducing the mean absolute error by approximately 30%. Additionally, it is the first solution to perform rigid head registration in a landmark-free manner on mesh data. © 2024 Wiley-VCH GmbH


Pages: 11
