American Sign Language Words Recognition Using Spatio-Temporal Prosodic and Angle Features: A Sequential Learning Approach

Cited by: 36
Authors
Abdullahi, Sunusi Bala [1, 2]
Chamnongthai, Kosin [3 ]
Affiliations
[1] King Mongkuts Univ Technol Thonburi, Dept Comp Engn, Fac Engn, Bangkok 10140, Thailand
[2] Nigeria Police, Force Criminal Invest & Intelligence Dept, Abuja 900211, Nigeria
[3] King Mongkuts Univ Technol Thonburi, Dept Elect & Telecommun Engn, Fac Engn, Bangkok 10140, Thailand
Keywords
Three-dimensional displays; Hidden Markov models; Gesture recognition; Heuristic algorithms; Assistive technologies; Sensors; Image segmentation; American Sign Language; deep learning; fast Fisher vector; hand gesture recognition; Leap Motion Controller; orientation angles; spatio-temporal sequence; ubiquitous computing
DOI
10.1109/ACCESS.2022.3148132
CLC classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Most of the available American Sign Language (ASL) words share similar characteristics. These similarities usually occur along the sign trajectory, which makes the words hard to tell apart and hinders ubiquitous application. Similar ASL words confuse translation algorithms and lead to misclassification. This paper designs a recognition algorithm for a large database of dynamic sign words, called FFV-Bi-LSTM, based on the fast Fisher vector (FFV) and bidirectional long short-term memory (Bi-LSTM). The algorithm is trained on 3D hand skeletal information, namely motion and orientation angle features obtained from the Leap Motion Controller (LMC). The features in each 3D video frame are concatenated and represented as a high-dimensional vector using FFV encoding. Evaluation results demonstrate that the FFV-Bi-LSTM algorithm is suitable for accurately recognizing dynamic ASL words on the basis of prosodic and angle cues. Comparison results further demonstrate that FFV-Bi-LSTM provides better recognition accuracy, 98.6% and 91.002% for randomly selected ASL dictionary words and 10 pairs of similar ASL words, respectively, under leave-one-subject-out cross-validation on the constructed dataset. The performance of FFV-Bi-LSTM is further evaluated on the ASL data set, the Leap Motion dynamic hand gestures data set (LMDHG), and the semaphoric hand gestures contained in the Shape Retrieval Contest (SHREC) data set, where it improves accuracy by 2%, 2%, and 3.19%, respectively.
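The leave-one-subject-out protocol mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `leave_one_subject_out` and the toy subject labels are hypothetical, and the sketch shows only how folds are formed, not the FFV-Bi-LSTM model itself.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    subject_ids[i] names the signer who produced sample i. In each fold,
    every sample from one signer is held out for testing and the samples
    from all other signers are used for training.
    """
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)

    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = sorted(
            i
            for subject, indices in by_subject.items()
            if subject != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx


# Hypothetical toy data: six recordings produced by three signers.
subjects = ["s1", "s1", "s2", "s3", "s3", "s2"]
folds = list(leave_one_subject_out(subjects))

for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Because no signer appears in both the training and test indices of a fold, the per-fold accuracy reflects performance on an unseen signer, which is the usual reason for choosing this protocol in gesture recognition.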
Pages: 15911-15923
Number of pages: 13