FastHand: Fast monocular hand pose estimation on embedded systems

Cited by: 14
Authors
An, Shan [1]
Zhang, Xiajie [2]
Wei, Dong [2]
Zhu, Haogang [1]
Yang, Jianyu [3]
Tsintotas, Konstantinos A. [4]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
[2] JD COM Inc, Tech & Data Ctr, Beijing 100108, Peoples R China
[3] Soochow Univ, Sch Rail Transportat, Suzhou 215006, Peoples R China
[4] Democritus Univ Thrace, Dept Prod & Management Engn, Xanthi 67132, Greece
Funding
National Natural Science Foundation of China
Keywords
Hand pose estimation; Landmark localization; Hand detection; Encoder-decoder network; Heatmap regression; Gesture recognition
DOI
10.1016/j.sysarc.2021.102361
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Hand pose estimation is a fundamental task in many applications related to human-robot interaction. However, previous approaches suffer from unsatisfactory hand landmark predictions in real-world scenes and a high computational burden. This paper proposes a fast and accurate framework for hand pose estimation, dubbed "FastHand". Using a lightweight encoder-decoder network architecture, FastHand fulfills the requirements of practical applications running on embedded devices. The encoder consists of deep layers with a small number of parameters, while the decoder uses spatial location information to obtain more accurate results. Evaluation on two publicly available datasets demonstrates improved performance over other state-of-the-art approaches. FastHand achieves high accuracy while running at 25 frames per second on an NVIDIA Jetson TX2 graphics processing unit.
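The abstract and keywords mention heatmap regression for landmark localization but do not say how landmark coordinates are read out of the predicted heatmaps. As an illustration only, the sketch below uses a common choice in heatmap-based pipelines: taking the peak (argmax) of each per-landmark heatmap. The function name `decode_landmarks` and the toy 3x3 heatmaps are assumptions made for this example and are not taken from the paper.

```python
def decode_landmarks(heatmaps):
    """Return one (x, y) peak location per heatmap.

    heatmaps: list of 2D lists (rows of floats), one per landmark.
    Assumption: the landmark is placed at the heatmap's maximum value.
    """
    landmarks = []
    for hm in heatmaps:
        best_x, best_y, best_val = 0, 0, float("-inf")
        for y, row in enumerate(hm):
            for x, val in enumerate(row):
                if val > best_val:
                    best_x, best_y, best_val = x, y, val
        landmarks.append((best_x, best_y))
    return landmarks


# Two toy 3x3 heatmaps standing in for two hand landmarks (made-up values).
toy = [
    [[0.0, 0.1, 0.0], [0.2, 0.9, 0.1], [0.0, 0.1, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.3, 0.8]],
]
print(decode_landmarks(toy))  # [(1, 1), (2, 2)]
```

Coordinates here are in heatmap pixels; a real pipeline would typically rescale them to the input image resolution.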
Pages: 8