Helping Hearing-Impaired in Emergency Situations: A Deep Learning-Based Approach

Cited by: 23
Authors
Areeb, Qazi Mohammad [1 ]
Maryam [1 ]
Nadeem, Mohammad [1 ]
Alroobaea, Roobaea [2 ]
Anwer, Faisal [1 ]
Affiliations
[1] Aligarh Muslim Univ, Dept Comp Sci, Aligarh 202002, Uttar Pradesh, India
[2] Taif Univ, Coll Comp & Informat Technol, Dept Comp Sci, At Taif 21944, Saudi Arabia
Keywords
Human-computer interaction; sign language; hand gesture recognition; classification; object detection; HAND GESTURE RECOGNITION; MEDICAL DIAGNOSIS; SIGN; VISION
DOI
10.1109/ACCESS.2022.3142918
CLC Classification Number
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
Hearing-impaired people use sign language to express their thoughts and emotions and to reinforce information delivered in daily conversations. Though they constitute a significant percentage of any population, most people cannot interact with them because of limited or no knowledge of sign languages. Sign language recognition aims to detect the significant motions of the human body, especially the hands, and to analyze and understand them. Such systems may become life-saving when hearing-challenged people are in desperate situations such as heart attacks or accidents. In the present study, deep learning-based hand gesture recognition models are developed to accurately predict the emergency signs of Indian Sign Language (ISL). The dataset used contains videos for eight different emergency situations. Several frames were extracted from the videos and fed to three different models. Two models are designed for classification, while the third is an object detection model applied after annotating the frames. The first model consists of a three-dimensional convolutional neural network (3D CNN), while the second comprises a pre-trained VGG-16 and a recurrent neural network with a long short-term memory (RNN-LSTM) scheme. The third model is based on YOLO (You Only Look Once) v5, an advanced object detection algorithm. The prediction accuracies of the classification models were 82% and 98%, respectively. The YOLO-based model outperformed the rest, achieving an impressive mean average precision of 99.6%.
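The abstract describes two distinct ways of presenting the extracted video frames to the classification models: as a single spatio-temporal volume (3D CNN) versus as a per-frame feature sequence (VGG-16 followed by an LSTM). The sketch below, which is not the authors' code, illustrates the difference in input shapes with numpy; the frame count, resolution, and feature dimension are assumed values chosen only for illustration.

```python
import numpy as np

# Hypothetical sketch (not from the paper): how one sign-language video clip
# would be shaped for the two classification pipelines in the abstract.
T, H, W, C = 16, 64, 64, 3           # assumed frames per clip and resolution
video = np.random.rand(T, H, W, C)   # stand-in for frames decoded from a video

# Pipeline 1: a 3D CNN consumes the whole clip as one spatio-temporal volume,
# so a batch dimension is added in front of (T, H, W, C).
clip_3dcnn = video[np.newaxis, ...]  # shape (1, T, H, W, C)

# Pipeline 2: a per-frame CNN such as VGG-16 maps each frame to a feature
# vector; the LSTM then models the temporal order of those T vectors.
feat_dim = 512                              # assumed CNN feature size
frame_feats = np.random.rand(T, feat_dim)   # stand-in for per-frame CNN output
lstm_input = frame_feats[np.newaxis, ...]   # shape (1, T, feat_dim)

print(clip_3dcnn.shape, lstm_input.shape)
```

The key design difference is that the 3D CNN learns spatial and temporal patterns jointly from the raw volume, while the VGG-16/LSTM pipeline separates spatial feature extraction (per frame) from temporal modeling (across frames).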
Pages: 8502-8517
Number of pages: 16