Optimized hand pose estimation CrossInfoNet-based architecture for embedded devices

Cited by: 0
Authors
Marek Šimoník
Michal Krumnikl
Affiliations
[1] VŠB – Technical University of Ostrava, Department of Computer Science, FEECS
Source
Machine Vision and Applications | 2022 / Vol. 33
Keywords
Convolutional neural network; Feature extractor; Hand pose estimation
DOI
Not available
Abstract
We present CrossInfoMobileNet, a convolutional neural network for hand pose estimation that is based on CrossInfoNet and tuned specifically for mobile phone processors by optimizing, modifying, and replacing the computationally critical components of CrossInfoNet. By introducing the state-of-the-art MobileNetV3 network as a feature extractor and refiner, and by replacing the ReLU activation with the better-performing H-Swish activation function, we obtained a network that requires 2.37 times fewer multiply-add operations and 2.22 times fewer parameters than CrossInfoNet while maintaining the same error on state-of-the-art datasets. This reduction in multiply-add operations resulted in real-world performance that is, on average, 1.56 times faster on both desktop and mobile devices, making the network more suitable for embedded applications. The full source code of CrossInfoMobileNet, including the sample dataset and its evaluation, is available online through Code Ocean.
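The abstract names the swap from ReLU to H-Swish as one of the changes. As a rough illustration only, the sketch below writes out the hard-swish function as it is commonly defined for MobileNetV3, x * ReLU6(x + 3) / 6, next to plain ReLU. The function names are chosen here for illustration and this is not taken from the authors' Code Ocean source; how the activation is wired into CrossInfoMobileNet is not described in this record.

```python
def relu(x: float) -> float:
    """Standard ReLU: passes positive values, zeroes out negatives."""
    return max(x, 0.0)


def relu6(x: float) -> float:
    """ReLU capped at 6, i.e. x clamped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_swish(x: float) -> float:
    """Hard-swish as commonly defined for MobileNetV3: x * ReLU6(x + 3) / 6.

    It is a piecewise approximation of swish that avoids evaluating a sigmoid,
    which is the usual argument for using it on mobile hardware.
    """
    return x * relu6(x + 3.0) / 6.0


def compare(values):
    """Return (x, relu(x), h_swish(x)) triples for a sequence of inputs."""
    return [(x, relu(x), h_swish(x)) for x in values]
```

Under this definition, H-Swish is zero for inputs at or below -3, equals the input for values at or above 3, and takes small negative values in between, whereas ReLU is zero for every negative input.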