Deep Learning-Based Sign Language Recognition System for Cognitive Development

Cited: 0
Authors
Maher Jebali
Abdesselem Dakhli
Wided Bakari
Affiliations
[1] University of Tunis, LaTICE
[2] University of Sfax, REGIM
[3] University of Sfax, MIR@CL
Source
Cognitive Computation | 2023, Vol. 15
Keywords
Sign language recognition; Deep learning; Recurrent neural network; Head pose
DOI
Not available
Abstract
Information in sign language (SL) is conveyed largely through the movement, position, and shape of the hands, together with body language and facial expressions. Sign language recognition systems can address the gap between the large number of people who depend on sign language and its limited adoption, giving deaf and hearing-impaired people more practical access to daily life, employment, and education. Although facial features are considered fundamental to human comprehension of sign language, few earlier studies have examined their cognitive importance for automatic SL recognition systems. To address this gap, this paper proposes a novel manual and non-manual gesture recognition framework (MNM-VGG16) for deaf and mute people. The framework employs a convolutional neural network, VGG-16, trained on a widely used video dataset, with a component that learns a Multimodal Spatial Representation (MSR) of the different modalities. A Multimodal Temporal Representation (MTR) component models temporal correlations along independent and dependent pathways to analyze the cooperation of the different modalities. A cooperative optimization scheme, built around a multi-scale perception component, is applied to make the best use of the different modality sources for sign language recognition. To validate the efficiency of MNM-VGG16, we carried out experiments on three large-scale sign language benchmarks: CSL Split II, SIGNUM, and RWTH-PHOENIX-Weather 2014. Experimental results show that the proposed framework achieves new state-of-the-art performance on all three benchmarks, reducing the word error rate (WER) on the test sets by 14.2%, 13.7%, and 11.2%, respectively. In summary, the proposed MNM-VGG16 hybrid method recognizes SL words by combining manual and non-manual features, demonstrating the significance of jointly modeling multiple body parts for SL recognition.
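The abstract reports results as word error rate (WER), the standard metric for continuous sign language recognition. As a minimal sketch (not the paper's evaluation code), WER is the word-level Levenshtein edit distance between a reference gloss sequence and the recognized sequence, normalized by the reference length:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution / match
    return dp[-1][-1] / len(ref)

# One substitution and one deletion against a 4-word reference -> WER 0.5
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

A reported "reduction of WER by 14.2%" thus means the framework makes proportionally fewer word-level insertion, deletion, and substitution errors on the test set than the prior state of the art.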
Pages: 2189–2201 (12 pages)