Deep learning-based sign language recognition system using both manual and non-manual components fusion

Cited by: 3
Authors
Jebali, Maher [1 ]
Dakhli, Abdesselem [1 ]
Bakari, Wided [1 ]
Affiliations
[1] Univ Hail, Comp Sci Dept, POB 2440, Hail 100190, Saudi Arabia
Source
AIMS MATHEMATICS | 2024, Vol. 9, Issue 1
Keywords
CNN; CTC; recurrent neural network; sign language recognition; head pose;
DOI
10.3934/math.2024105
CLC Number (Chinese Library Classification)
O29 [Applied Mathematics];
Subject Classification Code
070104;
Abstract
Sign language is regularly adopted by speech-impaired or deaf individuals to convey information; however, acquiring full knowledge or skill in it requires substantial effort. Sign language recognition (SLR) aims to close the gap between users and non-users of sign language by identifying signs from video sequences. This is a fundamental but arduous task, as sign language is produced with complex and often fast hand gestures and motions, facial expressions and expressive body postures. Non-manual features are therefore increasingly being examined, since numerous signs share identical manual components but differ in their non-manual components. To this end, we propose a novel manual and non-manual SLR system (MNM-SLR) using a convolutional neural network (CNN) to exploit multi-cue information for a high recognition rate. Specifically, we propose a deep convolutional long short-term memory (LSTM) model that simultaneously exploits non-manual features, summarized here by the head pose, and the embedded dynamics of manual features. In contrast to many previous works that rely on depth cameras, multi-camera setups or electronic gloves, we use only RGB video, which allows individuals to communicate with a deaf person through their personal devices. As a result, our framework achieves a high recognition rate, with an accuracy of 90.12% on the SIGNUM dataset and 94.87% on the RWTH-PHOENIX-Weather 2014 dataset.
Pages: 2105-2122
Number of pages: 18
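To make the architecture described in the abstract concrete, the following is a minimal, illustrative sketch of a two-stream CNN + LSTM recognizer with CTC training: one stream encodes RGB frames (manual cues), the other encodes head-pose angles (non-manual cues), and the two are fused per frame before a bidirectional LSTM. The layer sizes, the head-pose input format, the fusion-by-concatenation step and all names are assumptions made for this example, not the authors' exact MNM-SLR architecture.

# Minimal sketch (assumed design, not the paper's exact model): two-stream
# CNN + LSTM sign recognizer trained with CTC, following the abstract's idea of
# fusing manual (RGB frame) and non-manual (head pose) cues.
import torch
import torch.nn as nn


class TwoStreamSLR(nn.Module):
    def __init__(self, num_classes: int, pose_dim: int = 3, hidden: int = 256):
        super().__init__()
        # Manual stream: a small CNN that encodes each RGB frame.
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (batch*time, 64)
        )
        # Non-manual stream: per-frame head-pose angles (e.g. yaw/pitch/roll).
        self.pose_encoder = nn.Sequential(nn.Linear(pose_dim, 32), nn.ReLU())
        # Temporal model over the fused per-frame features.
        self.lstm = nn.LSTM(64 + 32, hidden, batch_first=True, bidirectional=True)
        # Per-frame class scores; index 0 is reserved as the CTC blank label.
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, frames: torch.Tensor, poses: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, H, W); poses: (batch, time, pose_dim)
        b, t = frames.shape[:2]
        manual = self.frame_encoder(frames.flatten(0, 1)).view(b, t, -1)
        non_manual = self.pose_encoder(poses)
        fused, _ = self.lstm(torch.cat([manual, non_manual], dim=-1))
        return self.classifier(fused).log_softmax(-1)  # (batch, time, classes)


if __name__ == "__main__":
    model = TwoStreamSLR(num_classes=20)
    frames = torch.randn(2, 16, 3, 64, 64)     # 2 clips, 16 frames each
    poses = torch.randn(2, 16, 3)              # per-frame head-pose angles
    log_probs = model(frames, poses)           # (2, 16, 20)

    # CTC aligns frame-level predictions with shorter gloss sequences.
    ctc = nn.CTCLoss(blank=0)
    targets = torch.randint(1, 20, (2, 5))     # 5 toy gloss labels per clip
    loss = ctc(log_probs.transpose(0, 1),      # CTC expects (time, batch, classes)
               targets,
               input_lengths=torch.full((2,), 16, dtype=torch.long),
               target_lengths=torch.full((2,), 5, dtype=torch.long))
    loss.backward()

The late-fusion-by-concatenation shown here is only one plausible way to combine the two cues; the score-level or attention-based fusion schemes cited in the related literature would slot in at the same point in the pipeline.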