Human skeleton pose and spatio-temporal feature-based activity recognition using ST-GCN

Cited by: 24
Authors
Lovanshi, Mayank [1]
Tiwari, Vivek [1,2]
Affiliations
[1] Int Inst Informat Technol IIIT, Naya Raipur, India
[2] ABV Indian Inst Informat Technol & Management, Gwalior, India
Keywords
Activity recognition; Pose estimation; ST-GCN; Spatio-temporal feature; Skeleton joints; Spatial distribution; Unified framework; Gradients; Model
DOI
10.1007/s11042-023-16001-9
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline Classification Code
0812
Abstract
Skeleton-based human activity recognition has recently attracted considerable attention because skeleton data are robust to changes in lighting, body size, dynamic camera viewpoints, and complex backgrounds. The Spatial-Temporal Graph Convolutional Network (ST-GCN) model has been shown to learn spatial and temporal dependencies from skeleton data effectively. However, making efficient use of the depth information in 3D skeletons remains a significant challenge, particularly for capturing human joint motion patterns and linkage information. This study proposes a promising solution based on a custom ST-GCN model and skeleton joints for human activity recognition. Special attention is given to spatial and temporal features, which are then fed to the classification model for better pose estimation. A comparative study of activity recognition is presented on large-scale datasets, namely NTU-RGB-D, Kinetics-Skeleton, and Florence 3D. The custom ST-GCN model outperforms the state-of-the-art methods in Top-1 accuracy on the NTU-RGB-D, Kinetics-Skeleton, and Florence 3D datasets by margins of 0.7%, 1.25%, and 1.92%, respectively. Similarly, in Top-5 accuracy, the custom ST-GCN model improves results by 0.5%, 0.73%, and 1.52%, respectively. This shows that the presented graph-based topologies capture the dynamics of a motion-based skeleton sequence better than several other approaches.
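
The abstract describes the model only at a high level. The following is a minimal, illustrative sketch of a generic spatial-temporal graph-convolution block in PyTorch, not the authors' custom ST-GCN: the class name STGCNBlock, the fixed identity adjacency, and all tensor shapes are assumptions made for illustration. The idea is the one the abstract names: a spatial graph convolution aggregates features over skeleton joints, and a temporal convolution aggregates them over frames.

# Illustrative sketch only (not the paper's model): a generic ST-GCN block.
import torch
import torch.nn as nn


class STGCNBlock(nn.Module):
    """One spatial-temporal block: graph conv over joints, then temporal conv."""

    def __init__(self, in_channels, out_channels, adjacency, temporal_kernel=9):
        super().__init__()
        # Fixed, normalized skeleton adjacency matrix of shape (V, V).
        self.register_buffer("A", adjacency)
        # Spatial step: 1x1 convolution mixes channels per joint.
        self.spatial = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        # Temporal step: the kernel spans the time axis only.
        pad = (temporal_kernel - 1) // 2
        self.temporal = nn.Conv2d(
            out_channels, out_channels,
            kernel_size=(temporal_kernel, 1), padding=(pad, 0),
        )
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # x: (N, C, T, V) = (batch, channels, frames, joints)
        x = self.spatial(x)
        # Aggregate each joint's features from its graph neighbours.
        x = torch.einsum("nctv,vw->nctw", x, self.A)
        x = self.temporal(x)
        return self.relu(self.bn(x))


# Example: 3D joint coordinates (C=3) for 25 joints over 50 frames.
V = 25
A = torch.eye(V)  # placeholder adjacency; a real model uses the skeleton graph
block = STGCNBlock(in_channels=3, out_channels=64, adjacency=A)
out = block(torch.randn(8, 3, 50, V))  # -> (8, 64, 50, 25)

Stacking several such blocks and ending with global pooling plus a linear classifier yields an ST-GCN-style activity classifier; the paper's custom model and its partitioned adjacency strategy are not reproduced here.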
Pages: 12705-12730
Number of pages: 26