Isolated Arabic Sign Language Recognition Using a Transformer-based Model and Landmark Keypoints

Cited by: 22
Authors
Alyami, Sarah [1,2]
Luqman, Hamzah [1,3]
Hammoudeh, Mohammad
Affiliations
[1] King Fahd Univ Petr & Minerals, Informat & Comp Sci Dept, Dhahran, Saudi Arabia
[2] Imam Abdulrahman Bin Faisal Univ, Appl Coll, Dammam, Saudi Arabia
[3] KFUPM, SDAIA KFUPM Joint Res Ctr Artificial Intelligence, Dhahran, Saudi Arabia
Keywords
Sign language recognition; Arabic sign language; gesture recognition; pose recognition; TCN; Transformer
DOI
10.1145/3584984
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pose-based approaches to sign language recognition yield lightweight, fast models suitable for real-time applications. This article presents a framework for isolated Arabic sign language recognition using hand and face keypoints. We employed the MediaPipe pose estimator to extract the keypoints of sign gestures from the video stream. Using the extracted keypoints, we proposed three models for sign language recognition: Long Short-Term Memory (LSTM), Temporal Convolutional Network (TCN), and Transformer-based models. We also investigated the importance of non-manual features for sign language recognition systems; the results showed that combining hand and face keypoints boosted recognition accuracy by around 4% compared with hand keypoints alone. The proposed models were evaluated on Arabic and Argentinian sign languages. On the KArSL-100 dataset, the proposed pose-based Transformer achieved the highest accuracy: 99.74% and 68.2% in signer-dependent and signer-independent modes, respectively. On the LSA64 dataset, the Transformer obtained accuracies of 98.25% and 91.09% in signer-dependent and signer-independent modes, respectively. Thus, the pose-based Transformer outperformed state-of-the-art techniques on both datasets using keypoints from the signer's hands and face.
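The pipeline the abstract describes — per-frame hand and face keypoints assembled into a sequence for a temporal model — can be sketched as follows. This is a minimal illustration, not the authors' code: the landmark counts (21 per hand, 468 for the face mesh) are MediaPipe Holistic's actual output sizes, but the random arrays stand in for real detector output, and the feature layout is an assumed (x, y, z) concatenation.

```python
import numpy as np

# MediaPipe Holistic landmark counts (real values from the library);
# the frame data below is a random stand-in for actual detector output.
HAND_LANDMARKS = 21    # landmarks per hand
FACE_LANDMARKS = 468   # face-mesh landmarks
COORDS = 3             # (x, y, z) per landmark

def frame_features(left_hand, right_hand, face):
    """Concatenate hand and face keypoints into one per-frame vector."""
    return np.concatenate([left_hand.ravel(), right_hand.ravel(), face.ravel()])

def sequence_tensor(frames):
    """Stack per-frame feature vectors into a (T, D) array, the input
    shape expected by a sequence model (LSTM / TCN / Transformer)."""
    return np.stack([frame_features(*f) for f in frames])

rng = np.random.default_rng(0)
T = 30  # frames in one isolated-sign clip (assumed length)
frames = [
    (rng.random((HAND_LANDMARKS, COORDS)),
     rng.random((HAND_LANDMARKS, COORDS)),
     rng.random((FACE_LANDMARKS, COORDS)))
    for _ in range(T)
]
X = sequence_tensor(frames)
print(X.shape)  # (30, 1530): (21 + 21 + 468) landmarks x 3 coords
```

Dropping the `face` term from `frame_features` gives the hands-only variant the paper compares against; the reported ~4% gain comes from keeping the non-manual (face) features in the vector.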
Pages: 19