Deep Learning-Based Hand Gesture Recognition System and Design of a Human–Machine Interface

Citations: 0
Authors
Abir Sen
Tapas Kumar Mishra
Ratnakar Dash
Affiliations
[1] Department of Computer Science and Engineering, National Institute of Technology
Source
Neural Processing Letters | 2023, Vol. 55
Keywords
Deep learning; Hand gesture recognition; Segmentation; Vision transformer; Kalman filter; Human–machine interface; Transfer learning; Virtual mouse
DOI
Not available
Abstract
Hand gesture recognition plays an important role in developing effective human–machine interfaces (HMIs) that enable direct communication between humans and machines. However, in real-time scenarios it is difficult to identify the correct hand gesture while the hands are moving. To address this issue, this work presents a low-cost, real-time human–computer interface (HCI) based on hand gesture recognition. The system consists of six stages: (1) hand detection, (2) gesture segmentation, (3) feature extraction and gesture classification using five pre-trained convolutional neural network (CNN) models and a vision transformer (ViT), (4) building an interactive human–machine interface (HMI), (5) development of a gesture-controlled virtual mouse, and (6) smoothing of the virtual mouse pointer using a Kalman filter. Five pre-trained CNN models (VGG16, VGG19, ResNet50, ResNet101, and Inception-V1) and a ViT are employed to classify hand gesture images, and two multi-class datasets (one public and one custom) are used to validate the models. Comparing the models' performances, Inception-V1 shows significantly better classification performance than the other four CNN models and the ViT in terms of accuracy, precision, recall, and F-score. The system has also been extended to control several multimedia applications (such as the VLC player, an audio player, and the 2D Super-Mario-Bros game) with customized gesture commands in real-time scenarios. The average speed of the system reaches 25 fps (frames per second), which meets real-time requirements, and each control action obtains an average response time on the order of milliseconds, making the system suitable for real-time use. This prototype can benefit physically disabled people in interacting with desktop computers.
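The paper itself does not include source code; as a minimal sketch of stage (6), the snippet below smooths raw (x, y) pointer detections with a constant-velocity Kalman filter. The class name, the noise parameters, and the 1/25 s time step (matching the reported 25 fps) are illustrative assumptions, not the authors' implementation:

```python
import numpy as np


class PointerKalman:
    """Constant-velocity Kalman filter for smoothing a 2-D cursor position.

    State x = [px, py, vx, vy]; each measurement is a raw (px, py)
    fingertip/pointer detection from the gesture pipeline.
    Noise parameters are illustrative, not tuned values from the paper.
    """

    def __init__(self, dt=1 / 25, process_var=50.0, meas_var=25.0):
        self.x = np.zeros(4)                # state estimate [px, py, vx, vy]
        self.P = np.eye(4) * 500.0          # state covariance (uncertain start)
        self.F = np.eye(4)                  # constant-velocity transition model
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))           # we observe position only
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = np.eye(4) * process_var    # process noise covariance
        self.R = np.eye(2) * meas_var       # measurement noise covariance

    def update(self, zx, zy):
        # Predict step: propagate state and covariance one frame forward.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Correct step: blend in the new raw pointer measurement.
        z = np.array([zx, zy], dtype=float)
        y = z - self.H @ self.x                       # innovation
        S = self.H @ self.P @ self.H.T + self.R       # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return float(self.x[0]), float(self.x[1])     # smoothed (px, py)
```

In a live system, each smoothed output would be passed to the OS cursor API instead of the jittery raw detection, which removes the high-frequency shake while the constant-velocity model keeps the cursor tracking deliberate hand motion with little lag.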
Pages: 12569–12596
Page count: 27