American Sign Language Words Recognition Using Spatio-Temporal Prosodic and Angle Features: A Sequential Learning Approach

Cited by: 36

Authors:

Abdullahi, Sunusi Bala ^{[1,2]}

Chamnongthai, Kosin ^{[3]}

Affiliations:

[1] King Mongkuts University Technol Thonburi, Dept Comp Engn, Fac Engn, Bangkok 10140, Thailand

[2] Nigeria Police, Force Criminal Invest & Intelligence Dept, Abuja 900211, Nigeria

[3] King Mongkuts Univ Technol Thonburi, Dept Elect & Telecommun Engn, Fac Engn, Bangkok 10140, Thailand

Source:

IEEE ACCESS | 2022 / Vol. 10

Keywords:

Three-dimensional displays; Hidden Markov models; Gesture recognition; Heuristic algorithms; Assistive technologies; Sensors; Image segmentation; American sign language; deep learning; fast fisher vector; hand gesture recognition; leap motion controller; orientation angles; spatio-temporal sequence; ubiquitous computing; GESTURE RECOGNITION;

DOI:

10.1109/ACCESS.2022.3148132

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Most of the available American Sign Language (ASL) words share similar characteristics. These similarities usually occur along the sign trajectory, which makes the words hard to distinguish and hinders ubiquitous application. Similar ASL words confuse translation algorithms, which leads to misclassification. In this paper, a recognition algorithm for a large database of dynamic sign words is designed, based on the fast Fisher vector (FFV) and bidirectional long short-term memory (Bi-LSTM) methods and called bidirectional long short-term memory-fast Fisher vector (FFV-Bi-LSTM). The algorithm is trained on 3D hand skeletal information, namely motion and orientation angle features obtained from the Leap Motion Controller (LMC). The bulk features in each 3D video frame are concatenated and represented as a high-dimensional vector using FFV encoding. Evaluation results demonstrate that the FFV-Bi-LSTM algorithm is suitable for accurately recognizing dynamic ASL words on the basis of prosodic and angle cues. Furthermore, comparison results demonstrate that FFV-Bi-LSTM provides better recognition accuracy, 98.6% and 91.002% for a randomly selected ASL dictionary and 10 pairs of similar ASL words, respectively, in leave-one-subject-out cross-validation on the constructed dataset. The performance of FFV-Bi-LSTM is further evaluated on the ASL data set, the Leap Motion Dynamic Hand Gestures (LMDHG) data set, and the semaphoric hand gestures contained in the Shape Retrieval Contest (SHREC) data set. Accuracy on the ASL, LMDHG, and SHREC data sets is improved by 2%, 2%, and 3.19%, respectively.


Pages: 15911-15923

Page count: 13

74 references in total

[1] A Ubiquitous WiFi-Based Fine-Grained Gesture Recognition System [J].