Sign and Human Action Detection Using Deep Learning

Cited by: 7
Authors
Dhulipala, Shivanarayna [1 ]
Adedoyin, Festus Fatai [1 ]
Bruno, Alessandro [2 ]
Affiliations
[1] Bournemouth Univ, Dept Comp & Informat, Talbot Campus Poole, Poole BH12 5BB, Dorset, England
[2] Humanitas Univ, Dept Biomed Sci, Via Rita Levi Montalcini 4, I-20072 Milan, Italy
Keywords
CNN; LSTM; confusion matrix; British Sign Language; precision; recall
DOI
10.3390/jimaging8070192
Chinese Library Classification (CLC)
TB8 [Photography Technology]
Subject Classification Code
0804
Abstract
Human beings usually rely on communication to express their feelings and ideas and to solve disputes among themselves. A major component required for effective communication is language. Language can occur in different forms, including written symbols, gestures, and vocalizations. It is usually essential for all communicating parties to be fully conversant with a common language. However, to date this has not been the case between speech-impaired people who use sign language and people who use spoken languages. A number of studies have pointed out a significant gap between these two groups which can limit the ease of communication. Therefore, this study aims to develop an efficient deep learning model that can be used to recognize British Sign Language, in an attempt to narrow this communication gap between speech-impaired and non-speech-impaired people in the community. Two models were developed in this research, a CNN and an LSTM, and their performance was evaluated using a multi-class confusion matrix. The CNN model emerged with the highest performance, attaining training and testing accuracies of 98.8% and 97.4%, respectively. In addition, the model achieved average weighted precision and recall of 97% and 96%, respectively. The LSTM model's performance, on the other hand, was quite poor, with maximum training and testing accuracies of only 49.4% and 48.7%, respectively. Our research concluded that the CNN model was the better of the two for recognizing British Sign Language.
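As a point of reference for the metrics reported above: weighted precision and recall follow directly from a multi-class confusion matrix, with each class's per-class precision and recall averaged using weights proportional to that class's support (its number of true instances). The sketch below shows this computation in generic NumPy code; it is an illustration under stated assumptions, not the authors' implementation, and the function name and the toy 3-class matrix are invented for the example.

import numpy as np

def weighted_precision_recall(cm):
    # cm: square confusion matrix, rows = true class, columns = predicted class.
    # Generic illustration of the "weighted" averages named in the abstract,
    # not code from the paper.
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                 # correct predictions per class
    support = cm.sum(axis=1)         # true instances per class
    predicted = cm.sum(axis=0)       # predicted instances per class

    # Per-class precision and recall, guarding against division by zero.
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)

    # Weight each class by its share of the true instances (its support).
    weights = support / support.sum()
    return float(weights @ precision), float(weights @ recall)

# Toy 3-class example (e.g., three signs); the counts are invented.
cm = [[50, 2, 1],
      [3, 45, 2],
      [1, 4, 42]]
p, r = weighted_precision_recall(cm)
print(f"weighted precision = {p:.3f}, weighted recall = {r:.3f}")

Support weighting matters here because sign-language datasets are rarely balanced: a class with many examples contributes proportionally more to the averages than a rare one, so the weighted figures better reflect overall performance than a plain per-class mean.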
Pages: 34