Unconstrained head pose estimation based on bilateral attention

Cited by: 0

Authors:

Zhang, Xiao ^{[1]}

Yan, Chunman ^{[2,3]}

Affiliations:

[1] Northwest Normal Univ, Coll Phys & Elect Engn, Lanzhou 730070, Peoples R China

[2] Northwest Normal Univ, Coll Phys, Lanzhou 730070, Peoples R China

[3] Northwest Normal Univ, Elect Engn Res Ctr Gansu Prov Intelligent Informat, Lanzhou 730070, Peoples R China

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2025 / Vol. 19 / Issue 05

Keywords:

Head pose estimation; Ghost module; Attention mechanism; Bilinear pooling; Multivariate loss function;

DOI:

10.1007/s11760-025-03925-y

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Head pose estimation is a critical research topic, yet existing models still face significant challenges. First, common representations for head pose estimation exhibit discontinuities. Second, recognition rates are low in complex scenes, and models tend to have high parameter counts and substantial computational demands. To address these problems, this paper proposes an unconstrained head pose estimation model based on bilinear attention. We introduce a 6D rotation matrix representation for the attitude angles and a P-Ghost module that enhances the lightweight GhostNetV2 framework for feature extraction. A bilinear attention network is also introduced to integrate spatial and channel information, enabling the model to learn feature correlations, prioritize key channels, and suppress redundant ones. Multiple loss function strategies are also used to improve the model's accuracy. The proposed network is tested extensively on three datasets, with experimental results showing superior performance in head pose estimation.
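The 6D rotation representation mentioned in the abstract can be illustrated with a short sketch. The abstract does not give the exact mapping, so the code below assumes the commonly used construction in which a 6-value network output is split into two 3D vectors and orthonormalized with a Gram-Schmidt step to form a rotation matrix. The function name `six_d_to_rotation_matrix` and its helpers are hypothetical and are not taken from the paper.

```python
import math


def _normalize(v):
    """Return v scaled to unit length; reject near-zero vectors."""
    n = math.sqrt(sum(x * x for x in v))
    if n < 1e-12:
        raise ValueError("cannot normalize a near-zero vector")
    return [x / n for x in v]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def six_d_to_rotation_matrix(x):
    """Map 6 real values to a 3x3 rotation matrix (assumed Gram-Schmidt mapping).

    x[:3] and x[3:] are treated as two unnormalized 3D vectors. The result is
    a list of three rows whose columns are the orthonormal basis b1, b2, b3.
    """
    if len(x) != 6:
        raise ValueError("expected exactly 6 values")
    a1, a2 = list(x[:3]), list(x[3:])
    b1 = _normalize(a1)
    # Remove the component of a2 along b1, then normalize the remainder.
    proj = _dot(b1, a2)
    b2 = _normalize([a - proj * b for a, b in zip(a2, b1)])
    # The third axis is fixed by the right-hand rule.
    b3 = _cross(b1, b2)
    return [[b1[i], b2[i], b3[i]] for i in range(3)]
```

Under this assumed mapping, any 6 values with linearly independent halves yield a valid rotation matrix, which is the usual reason given for preferring it over Euler angles or quaternions when discontinuities are a concern.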


Pages: 11

