Temporal segment graph convolutional networks for skeleton-based action recognition

Cited by: 32
Authors
Ding, Chongyang [1]
Wen, Shan [1]
Ding, Wenwen [2]
Liu, Kai [1]
Belyaev, Evgeny [3]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Huaibei Normal Univ, Sch Math Sci, Huaibei, Peoples R China
[3] ITMO Univ, Int Lab Comp Technol, St Petersburg, Russia
Keywords
Skeleton sequence; Temporal misalignment; Action recognition; Temporal segment; Graph construction
DOI
10.1016/j.engappai.2022.104675
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Different actions usually emphasize different parts of the skeleton, and even within a single action, different stages place emphasis on different parts. Previous studies generally construct the human skeleton graph in a predefined manner and therefore lack adaptability to different action modes. In addition, these methods simply pad or truncate the skeleton sequence to fix its length, which introduces an additional temporal misalignment problem. In this work, we propose a novel temporal segment graph convolutional network (TS-GCN) for skeleton-based action recognition. Our model divides the whole sequence into several subsequences. GCNs are then applied to each subsequence to capture the dynamic information stage by stage, which aligns the motion features in the temporal domain. Furthermore, to explore the intrinsic features contained in each subsequence, our model introduces a graph-adaptive method that constructs an individual graph for each subsequence, learned and updated from the skeleton data, which makes graph construction general enough to adapt to different sequences. Extensive experiments are conducted on two standard datasets, NTU-RGB+D and Kinetics. The experimental results demonstrate the effectiveness of the proposed method.
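The segment-wise idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the row normalization of the adjacency matrix, the single-layer ReLU graph convolution, the temporal mean pooling, and the sizes used (50 frames, 25 joints, 3 coordinates, 4 segments) are all assumptions chosen for illustration. The per-segment adjacency matrices are drawn at random here to stand in for parameters that would be learned from skeleton data.

```python
import numpy as np


def split_into_segments(seq, num_segments):
    """Split a (T, V, C) skeleton sequence into contiguous segments along time."""
    return np.array_split(seq, num_segments, axis=0)


def normalize_adjacency(adj):
    """Make the matrix non-negative, add self-loops, and row-normalize it."""
    adj = np.abs(adj) + np.eye(adj.shape[0])
    return adj / adj.sum(axis=1, keepdims=True)


def segment_graph_conv(segment, adj, weight):
    """One graph convolution on a (t, V, C) segment: ReLU(A X W), averaged over time."""
    a_hat = normalize_adjacency(adj)
    out = np.einsum("uv,tvc,cd->tud", a_hat, segment, weight)
    return np.maximum(out, 0.0).mean(axis=0)  # shape (V, D)


def segment_features(seq, adjs, weights):
    """Apply a separate graph (adjacency) and weight matrix to each segment."""
    segments = split_into_segments(seq, len(adjs))
    return np.stack(
        [segment_graph_conv(s, a, w) for s, a, w in zip(segments, adjs, weights)]
    )


# Hypothetical sizes: T frames, V joints, C input channels, D output channels, K segments.
rng = np.random.default_rng(0)
T, V, C, D, K = 50, 25, 3, 8, 4
seq = rng.normal(size=(T, V, C))
# Stand-ins for learnable per-segment graphs and weights (random, not trained).
adjs = [rng.normal(size=(V, V)) for _ in range(K)]
weights = [rng.normal(size=(C, D)) for _ in range(K)]

features = segment_features(seq, adjs, weights)
print(features.shape)  # (4, 25, 8)
```

Because every segment is pooled to a fixed-size feature regardless of how many frames it contains, sequences of different lengths yield outputs of the same shape without padding or truncation, which is the alignment property the abstract refers to.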
Pages: 8