Semantic-guided multi-scale human skeleton action recognition

Cited by: 8
Authors
Qi, Yongfeng [1 ]
Hu, Jinlin [1 ]
Zhuang, Liqiang [1 ]
Pei, Xiaoxu [1 ]
Affiliations
[1] Northwest Normal Univ, Coll Comp Sci & Engn, Lanzhou 730070, Gansu, Peoples R China
Keywords
Human skeleton; Action recognition; Semantic information; Multi-scale neural network; Multi-scale receptive field; GRAPH CONVOLUTIONAL NETWORKS; LSTM; FUSION; GCN
DOI
10.1007/s10489-022-03968-5
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the development of depth sensors and pose estimation algorithms, action recognition based on the human skeleton has attracted wide attention from researchers. Skeleton action recognition methods that embed semantic information achieve strong recognition results at low computational cost by extracting spatio-temporal features of all joints; nevertheless, they introduce information redundancy and are limited in capturing long-term spatio-temporal context. In this work, we propose a semantic-guided multi-scale neural network (SGMSN) for skeleton action recognition. For spatial modeling, the key insight of our approach is to achieve multi-scale graph convolution by manipulating the data level, without additional computational cost. For temporal modeling, we build a multi-scale temporal convolutional network with a multi-scale receptive field across the temporal dimension. Experiments were carried out on two publicly available large-scale skeleton datasets, NTU RGB+D and NTU RGB+D 120. On NTU RGB+D, the accuracy is 90.1% (cross-subject) and 95.8% (cross-view). The experimental results show that the proposed network architecture outperforms most current state-of-the-art action recognition models.
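The multi-scale temporal modeling described above (parallel branches whose receptive fields grow across the temporal dimension) can be illustrated with a minimal sketch. This is not the authors' SGMSN implementation; the shared smoothing kernel, the dilation set, and the function names (`temporal_conv`, `multi_scale_temporal_block`) are illustrative assumptions. Each branch applies the same temporal filter at a different dilation rate, so branches see short- and long-range context, and their outputs are concatenated channel-wise.

```python
import numpy as np

def temporal_conv(x, kernel, dilation):
    """Dilated 1D convolution along the time axis with 'same' zero padding.
    x: (T, C) array of per-frame skeleton features.
    kernel: (K,) temporal filter shared across channels (a simplification)."""
    T, C = x.shape
    K = len(kernel)
    pad = dilation * (K - 1) // 2          # keep output length == T
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for t in range(T):
        for k in range(K):
            # dilation spaces the taps k*dilation frames apart,
            # enlarging the temporal receptive field without more weights
            out[t] += kernel[k] * xp[t + k * dilation]
    return out

def multi_scale_temporal_block(x, dilations=(1, 2, 3)):
    """Concatenate branches with different temporal receptive fields."""
    kernel = np.array([0.25, 0.5, 0.25])   # toy low-pass filter, weights sum to 1
    branches = [temporal_conv(x, kernel, d) for d in dilations]
    return np.concatenate(branches, axis=1)  # (T, C * len(dilations))

x = np.random.randn(30, 8)                  # 30 frames, 8 feature channels
y = multi_scale_temporal_block(x)
print(y.shape)                              # (30, 24)
```

In a real network each branch would use learned per-channel weights (e.g. dilated `Conv2d` layers over a joints-by-frames tensor); the sketch only shows how dilation widens temporal context while the frame count is preserved.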
Pages: 9763-9778
Page count: 16