FastHand: Fast monocular hand pose estimation on embedded systems

Cited by: 15
Authors
An, Shan [1]
Zhang, Xiajie [2]
Wei, Dong [2]
Zhu, Haogang [1]
Yang, Jianyu [3]
Tsintotas, Konstantinos A. [4]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
[2] JD COM Inc, Tech & Data Ctr, Beijing 100108, Peoples R China
[3] Soochow Univ, Sch Rail Transportat, Suzhou 215006, Peoples R China
[4] Democritus Univ Thrace, Dept Prod & Management Engn, Xanthi 67132, Greece
Funding
National Natural Science Foundation of China;
Keywords
Hand pose estimation; Landmark localization; Hand detection; Encoder-decoder network; Heatmap regression; Gesture recognition
DOI
10.1016/j.sysarc.2021.102361
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Hand pose estimation is a fundamental task in many applications related to human-robot interaction. However, previous approaches suffer from unsatisfactory hand landmark predictions in real-world scenes and a high computational burden. This paper proposes a fast and accurate framework for hand pose estimation, dubbed "FastHand". Using a lightweight encoder-decoder network architecture, FastHand fulfills the requirements of practical applications running on embedded devices. The encoder consists of deep layers with a small number of parameters, while the decoder uses spatial location information to obtain more accurate results. Evaluation on two publicly available datasets demonstrates the improved performance of the proposed pipeline compared with other state-of-the-art approaches. FastHand achieves high accuracy while reaching a speed of 25 frames per second on an NVIDIA Jetson TX2 graphics processing unit.
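The keywords name heatmap regression as the landmark localization technique, but this record does not describe how FastHand turns heatmaps into coordinates. As a generic illustration only, the sketch below takes the peak of each heatmap and rescales it to image coordinates. The function name `decode_landmarks`, the heatmap and image sizes, and the half-pixel offset are all assumptions made for this example, not details from the paper.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # rows of per-pixel confidence values


def decode_landmarks(
    heatmaps: List[Heatmap], image_size: Tuple[int, int]
) -> List[Tuple[float, float, float]]:
    """Return one (x, y, score) per heatmap, scaled to image_size = (width, height).

    Hypothetical helper: plain argmax decoding, not FastHand's published method.
    """
    img_w, img_h = image_size
    landmarks = []
    for hm in heatmaps:
        hm_h, hm_w = len(hm), len(hm[0])
        best_score, best_row, best_col = hm[0][0], 0, 0
        # Scan for the highest-confidence cell (first occurrence wins on ties).
        for r, row in enumerate(hm):
            for c, value in enumerate(row):
                if value > best_score:
                    best_score, best_row, best_col = value, r, c
        # Map the cell centre back to image coordinates (half-pixel offset assumed).
        x = (best_col + 0.5) * img_w / hm_w
        y = (best_row + 0.5) * img_h / hm_h
        landmarks.append((x, y, best_score))
    return landmarks


if __name__ == "__main__":
    # One 4x4 heatmap with its peak at row 1, column 2, for a 64x64 image.
    demo = [[0.0] * 4 for _ in range(4)]
    demo[1][2] = 0.9
    print(decode_landmarks([demo], (64, 64)))
```

In a real pipeline there would be one heatmap per hand landmark, and a sub-pixel refinement step is often used in place of the plain argmax shown here.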
Pages: 8