Deep Learning Technology to Recognize American Sign Language Alphabet

Cited by: 11
Authors
Alsharif, Bader [1 ,2 ]
Altaher, Ali Salem [1 ]
Altaher, Ahmed [1 ,3 ]
Ilyas, Mohammad [1 ]
Alalwany, Easa [1 ,4 ]
Affiliations
[1] Florida Atlantic Univ, Dept Elect Engn & Comp Sci, 777 Glades Rd, Boca Raton, FL 33431 USA
[2] Tech & Vocat Training Corp TVTC, Coll Telecommun & Informat, Dept Comp Sci & Engn, Riyadh 11564, Saudi Arabia
[3] Al Nahrain Univ, Elect Comp Ctr, Baghdad 64074, Iraq
[4] Taibah Univ, Coll Comp Sci & Engn, Yanbu 46421, Saudi Arabia
Keywords
image-based; American Sign Language; deep learning; transfer learning; AlexNet; ConvNeXt; EfficientNet; ResNet-50; Vision Transformer
DOI
10.3390/s23187970
CLC Classification Number
O65 [Analytical Chemistry];
Subject Classification Codes
070302; 081704
Abstract
Historically, individuals with hearing impairments have often been neglected, lacking tools that facilitate effective communication. Advancements in modern technology, however, have enabled the development of various tools and software aimed at improving the quality of life for hearing-disabled individuals. This paper presents a comprehensive study employing five distinct deep learning models to recognize hand gestures for the American Sign Language (ASL) alphabet. The primary objective of the study is to leverage contemporary technology to bridge the communication gap between hearing-impaired and hearing individuals. The five models (AlexNet, ConvNeXt, EfficientNet, ResNet-50, and Vision Transformer) were trained and tested on an extensive dataset comprising over 87,000 images of ASL alphabet hand gestures. Numerous experiments were conducted, modifying the models' architectural design parameters to maximize recognition accuracy. The experimental results show that ResNet-50 achieved the highest accuracy of all models, 99.98%; EfficientNet attained 99.95%, ConvNeXt 99.51%, and AlexNet 99.50%, while Vision Transformer yielded the lowest accuracy, 88.59%.
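As a rough illustration of the transfer-learning setup the abstract describes, the sketch below fine-tunes torchvision's ImageNet-pretrained ResNet-50 (the paper's best-performing model) for ASL alphabet classification. The dataset path, the class count of 29 (26 letters plus "space", "del", and "nothing", as in the public ASL Alphabet dataset), the epoch count, and the optimizer settings are illustrative assumptions, not the authors' reported configuration.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Assumed dataset layout: one folder per class, as in the public
# ASL Alphabet dataset (26 letters + "space", "del", "nothing").
DATA_DIR = "asl_alphabet_train"  # hypothetical path
NUM_CLASSES = 29

# Standard ImageNet preprocessing, since the backbone is ImageNet-pretrained.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

train_set = datasets.ImageFolder(DATA_DIR, transform=preprocess)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)

# Transfer learning: start from ImageNet weights and replace the
# 1000-class head with a 29-class linear classifier.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(5):  # epoch count is illustrative, not from the paper
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)
    print(f"epoch {epoch + 1}: loss {running_loss / len(train_set):.4f}")

The same loop applies to the other four models by swapping the backbone and its classifier head (e.g., torchvision's efficientnet, convnext, alexnet, or vit constructors); only the head-replacement line differs per architecture.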
Pages: 20