DS-HPE: Deep Set for Head Pose Estimation

Cited by: 2

Authors:

Menan, Velayuthan [1]

Gawesha, Asiri [1]

Samarasinghe, Pradeepa [1]

Kasthurirathna, Dharshana [1]

Affiliations:

[1] Sri Lanka Inst Informat Technol, Fac Comp, Malabe, Sri Lanka

Source:

2023 IEEE 13TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE, CCWC | 2023

Keywords:

Head Pose Estimation; Deep Sets; Landmark-based method;

DOI:

10.1109/CCWC57344.2023.10099159

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Head pose estimation is a critical task that is fundamental to a variety of real-world applications, such as virtual and augmented reality, as well as human behavior analysis. In the past, facial landmark-based methods were the dominant approach to head pose estimation. However, recent research has demonstrated the effectiveness of landmark-free methods, which have achieved state-of-the-art (SOTA) results. In this study, we utilize the Deep Set architecture for the first time in the domain of head pose estimation. Deep Set is a specialized architecture that works on a "set" of data as a result of the "permutation-invariance" operator being utilized in the model. As a result, the model is a simple yet powerful and edge-computation-friendly method for estimating head pose. We evaluate our proposed method on two benchmark data sets, and we compare our method against SOTA methods on a challenging video-based data set. Our results indicate that our proposed method not only achieves comparable accuracy to these SOTA methods but also requires less computational time. Furthermore, the simplicity of our proposed method allows for its deployment in resource-constrained environments without the need for expensive hardware such as Graphics Processing Units (GPUs). This work underscores the importance of accurate and resource-efficient head pose estimation in the fields of computer vision and human-computer interaction, and the Deep Set architecture presents a promising approach to achieving this goal.
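The permutation invariance mentioned in the abstract can be illustrated with a minimal sketch of the general Deep Sets form, rho(sum over i of phi(x_i)). This is not the authors' model: the 2-D landmark inputs, the three-value output (read here as yaw, pitch, roll), the layer sizes, and the random weights are all assumptions chosen only to show that reordering the input set leaves the output unchanged.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (illustrative only)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x, activation=None):
    """Apply one dense layer to a single vector, with an optional activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    if activation is not None:
        out = [activation(v) for v in out]
    return out


def deep_set_forward(points, phi, rho):
    """Compute rho(sum_i phi(x_i)): embed each element, sum-pool, then decode."""
    embedded = [dense(phi, p, math.tanh) for p in points]
    # Sum pooling is the permutation-invariant step: element order is discarded.
    pooled = [math.fsum(column) for column in zip(*embedded)]
    return dense(rho, pooled)


rng = random.Random(0)
phi = make_layer(2, 8, rng)  # per-element encoder; 2-D input is an assumption
rho = make_layer(8, 3, rng)  # decoder to 3 values; yaw/pitch/roll is an assumption

# Hypothetical "landmarks": five random 2-D points standing in for real detections.
landmarks = [[rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)] for _ in range(5)]
shuffled = landmarks[:]
rng.shuffle(shuffled)

out_a = deep_set_forward(landmarks, phi, rho)
out_b = deep_set_forward(shuffled, phi, rho)
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(out_a, out_b)))
```

Because the only interaction between elements is a sum, the cost grows linearly with the number of elements and needs no convolution over image pixels, which is one plausible reading of why the abstract describes the approach as friendly to edge computation.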

Pages: 1179-1184

Page count: 6
