Single Shot Detector CNN and Deep Dilated Masks for Vision-Based Hand Gesture Recognition From Video Sequences

Cited by: 3

Authors:

Al Farid, Fahmid [1]

Hashim, Noramiza [2]

Bin Abdullah, Junaidi [2]

Bhuiyan, Md. Roman [2]

Kairanbay, Magzhan [3]

Yusoff, Zulfadzli [1]

Karim, Hezerul Abdul [1]

Mansor, Sarina [1]

Sarker, MD. Tanjil [1]

Ramasamy, Gobbi [1]

Affiliations:

[1] Multimedia Univ, Fac Engn, Cyberjaya 63100, Malaysia

[2] Multimedia Univ, Fac Comp & Informat, Cyberjaya 63100, Malaysia

[3] Suleyman Demirel Univ SDU, Fac Engn & Nat Sci, Alma Ata 32260, Kazakhstan

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Gesture recognition; Support vector machines; Human computer interaction; Streaming media; Multimedia computing; Convolutional neural networks; Deep learning; video sequences; SVM; SSD-CNN; deep dilated mask;

DOI:

10.1109/ACCESS.2024.3360857

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

As the world's population grows, innovative human-computer interaction technologies can help individuals lead more fulfilling lives. Gesture-based technology has the potential to improve the safety and well-being of people with impairments as well as the general population. Recognizing gestures from video streams is difficult because the characteristics of each motion vary widely across individuals. In this article, we propose deep learning methods to automatically recognize hand gestures from RGB and depth data; either form of data may be used to train neural networks to detect hand gestures. Gesture-based interfaces are more natural, intuitive, and straightforward. Earlier studies attempted to characterize hand motions in a number of contexts. Our technique is evaluated using a vision-based gesture recognition system. In the proposed technique, image acquisition starts with RGB video and depth information captured with the Kinect sensor, followed by hand tracking with a single shot detector Convolutional Neural Network (SSD-CNN). When the kernel is applied, it produces an output value at each of the $m \times n$ locations. Using a collection of convolutional filters, each new feature layer generates a defined set of gesture detection predictions. After that, we perform deep dilation to make the gesture in the image masks more visible. Finally, hand gestures are classified using the well-known SVM classification technique. Using deep learning, we recognize hand gestures on the SKIG dataset with accuracies of 93.68% in the RGB passage, 83.45% in the depth passage, and 90.61% in the RGB-D conjunction, higher than the state of the art. On our own Different Camera Orientation Gesture (DCOG) dataset, we likewise obtained higher accuracy: 92.78% in the RGB passage, 79.55% in the depth passage, and 88.56% in the RGB-D conjunction for gestures collected at a 0-degree angle. Moreover, the framework intends to use unique methodologies to construct a superior vision-based hand gesture recognition system.
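
The abstract does not specify how the "deep dilation" step is implemented. As a rough illustration only, the sketch below applies plain binary morphological dilation with a 3x3 square structuring element to a toy hand mask, which is one common way to thicken a foreground region so it becomes more visible. The function name `dilate`, the `iterations` parameter, and the toy `hand_mask` are hypothetical and are not taken from the paper.

```python
def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 square structuring element.

    Assumption: this is ordinary morphological dilation, used here as a
    stand-in for the paper's unspecified "deep dilation" step.
    `mask` is a list of equal-length rows containing 0/1 values.
    """
    rows, cols = len(mask), len(mask[0])
    for _ in range(iterations):
        out = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # A pixel is switched on if any pixel in its 3x3
                # neighbourhood (clipped at the border) is on.
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]:
                            out[r][c] = 1
        mask = out  # the caller's input list is never modified
    return mask


# Hypothetical 5x5 mask with a single foreground pixel in the centre.
hand_mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

dilated = dilate(hand_mask)
for row in dilated:
    print("".join(str(v) for v in row))
```

One pass grows the single pixel into a 3x3 block; more passes grow the region further, so in practice the number of iterations would have to be tuned against the mask resolution.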

Pages: 28564-28574

Page count: 11
