Continuous Sign Language Recognition Based on Spatial-Temporal Graph Attention Network

Cited by: 5
Authors
Guo, Qi [1]
Zhang, Shujun [1]
Li, Hui [1]
Affiliation
[1] Qingdao Univ Sci & Technol, Coll Informat Sci & Technol, Qingdao 266061, Peoples R China
Source
CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES | 2023, Vol. 134, No. 3
Keywords
Continuous sign language recognition; graph attention network; bidirectional long short-term memory; connectionist temporal classification
DOI
10.32604/cmes.2022.021784
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
Continuous sign language recognition (CSLR) is challenging because of complex video backgrounds, variability in hand gestures, and the difficulty of temporal modeling. This work proposes a CSLR method based on a spatial-temporal graph attention network that focuses on the essential features of video sequences. The method captures local details of sign language movements by taking joint and bone information as input and constructing a spatial-temporal graph that reflects inter-frame relevance and the physical connections between nodes. A graph-based multi-head attention mechanism, combined with adjacency matrix calculation, is used for better local-feature exploration, and short-term motion correlations are modeled with a temporal convolutional network. A bidirectional long short-term memory network (BLSTM) learns long-term dependencies, and connectionist temporal classification (CTC) aligns the word-level sequences. The proposed method achieves competitive results: a word error rate of 1.59% on the Chinese Sign Language dataset and a mean Jaccard Index of 65.78% on the ChaLearn LAP Continuous Gesture Dataset.
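The abstract names connectionist temporal classification for alignment and word error rate as the evaluation metric. The following is a minimal sketch of those two generic ideas only, not the authors' implementation: the standard CTC best-path collapsing rule (merge repeated frame labels, then drop blanks) and word error rate computed as word-level edit distance divided by reference length. The blank symbol `<blank>`, the function names, and the example gloss sequence are all hypothetical and chosen for illustration.

```python
from itertools import groupby

BLANK = "<blank>"  # hypothetical blank symbol; real systems usually use index 0


def ctc_greedy_collapse(frame_labels, blank=BLANK):
    """Collapse per-frame labels: merge consecutive repeats, then drop blanks."""
    return [label for label, _ in groupby(frame_labels) if label != blank]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and hypothesis[:j]; it starts as the distance from an empty prefix.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]  # distance from reference[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


# Illustrative per-frame predictions (made up, not taken from the paper).
frames = [BLANK, "I", "I", BLANK, "like", "like", BLANK, BLANK, "tea", "tea"]
decoded = ctc_greedy_collapse(frames)
wer = word_error_rate(["I", "like", "green", "tea"], decoded)
print(decoded, wer)
```

In this made-up example the decoded sequence misses one of four reference words, so the sketch reports a word error rate of 0.25. A blank between two identical labels keeps them as separate words, which is why CTC needs the blank symbol at all.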
Pages: 1653-1670
Page count: 18