Isolated Arabic Sign Language Recognition Using a Transformer-based Model and Landmark Keypoints

Cited by: 22
Authors
Alyami, Sarah [1,2]
Luqman, Hamzah [1,3]
Hammoudeh, Mohammad
Affiliations
[1] King Fahd Univ Petr & Minerals, Informat & Comp Sci Dept, Dhahran, Saudi Arabia
[2] Imam Abdulrahman Bin Faisal Univ, Appl Coll, Dammam, Saudi Arabia
[3] KFUPM, SDAIA KFUPM Joint Res Ctr Artificial Intelligence, Dhahran, Saudi Arabia
Keywords
Sign language recognition; Arabic sign language; gesture recognition; pose recognition; TCN; Transformer
DOI
10.1145/3584984
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pose-based approaches to sign language recognition yield lightweight, fast models suitable for real-time applications. This article presents a framework for isolated Arabic sign language recognition using hand and face keypoints. We employed the MediaPipe pose estimator to extract the keypoints of sign gestures from the video stream. Using the extracted keypoints, we proposed three models for sign language recognition: Long Short-Term Memory (LSTM), Temporal Convolutional Network (TCN), and Transformer-based models. We also investigated the importance of non-manual features for sign language recognition systems; the results showed that combining hand and face keypoints boosted recognition accuracy by around 4% compared with hand keypoints alone. The proposed models were evaluated on Arabic and Argentinian sign languages. On the KArSL-100 dataset, the proposed pose-based Transformer achieved the highest accuracy: 99.74% and 68.2% in signer-dependent and signer-independent modes, respectively. On the LSA64 dataset, the Transformer obtained accuracies of 98.25% and 91.09% in signer-dependent and signer-independent modes, respectively. Thus, the pose-based Transformer outperformed state-of-the-art techniques on both datasets using keypoints from the signer's hands and face.
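The pipeline the abstract describes — per-frame hand and face keypoints assembled into a sequence for a temporal model — can be sketched as follows. This is a minimal illustration, not the authors' code: the landmark counts (21 per hand, 468 for the face mesh) are MediaPipe Holistic's actual output sizes, but the random arrays stand in for real detector output, and the feature layout is an assumed (x, y, z) concatenation.

```python
import numpy as np

# MediaPipe Holistic landmark counts (real values from the library);
# the frame data below is a random stand-in for actual detector output.
HAND_LANDMARKS = 21    # landmarks per hand
FACE_LANDMARKS = 468   # face-mesh landmarks
COORDS = 3             # (x, y, z) per landmark

def frame_features(left_hand, right_hand, face):
    """Concatenate hand and face keypoints into one per-frame vector."""
    return np.concatenate([left_hand.ravel(), right_hand.ravel(), face.ravel()])

def sequence_tensor(frames):
    """Stack per-frame feature vectors into a (T, D) array, the input
    shape expected by a sequence model (LSTM / TCN / Transformer)."""
    return np.stack([frame_features(*f) for f in frames])

rng = np.random.default_rng(0)
T = 30  # frames in one isolated-sign clip (assumed length)
frames = [
    (rng.random((HAND_LANDMARKS, COORDS)),
     rng.random((HAND_LANDMARKS, COORDS)),
     rng.random((FACE_LANDMARKS, COORDS)))
    for _ in range(T)
]
X = sequence_tensor(frames)
print(X.shape)  # (30, 1530): (21 + 21 + 468) landmarks x 3 coords
```

Dropping the `face` term from `frame_features` gives the hands-only variant the paper compares against; the reported ~4% gain comes from keeping the non-manual (face) features in the vector.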
Pages: 19