Multi-scale skeleton simplification graph convolutional network for skeleton-based action recognition

Cited by: 0
Authors
Fan, Zhang [1 ]
Ding, Chongyang [1 ]
Kai, Liu [1 ]
Liu, Hongjin [2 ]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] SunWise Space Technol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
computer vision; convolution; feature extraction; neural net architecture; neural nets;
DOI
10.1049/cvi2.12300
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human action recognition based on graph convolutional networks (GCNs) is one of the hotspots in computer vision. However, previous methods generally rely on a handcrafted graph, which limits how effectively the model can characterise the connections between indirectly connected joints. This limitation weakens the connections between joints that are separated by long distances. To address this issue, the authors propose a skeleton simplification method that reduces both the number of joints and the distance between joints by merging adjacent joints into simplified joints. A group convolutional block is devised to extract the internal features of the simplified joints. Additionally, the authors enhance the method by introducing multi-scale modelling, which maps inputs into sequences across various levels of simplification. Combined with spatial-temporal graph convolution, a multi-scale skeleton simplification GCN for skeleton-based action recognition (M3S-GCN) is proposed for fusing multi-scale skeleton sequences and modelling the connections between joints. Finally, M3S-GCN is evaluated on five benchmarks from the NTU RGB+D 60 (C-Sub, C-View), NTU RGB+D 120 (X-Sub, X-Set) and NW-UCLA datasets. Experimental results show that M3S-GCN achieves state-of-the-art performance, with accuracies of 93.0%, 97.0% and 91.2% on the C-Sub, C-View and X-Set benchmarks respectively, which validates the effectiveness of the method.
Graphical abstract: The authors propose a multi-scale skeleton simplification graph convolutional network (M3S-GCN) for skeleton-based action recognition. The model leverages skeleton simplification and multi-scale modelling to effectively capture the intricate connections between the joints, and achieves state-of-the-art performance on three benchmarks: NTU RGB+D C-Sub, NTU RGB+D C-View and NTU RGB+D 120 X-Set.
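The merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `simplify_skeleton`, the choice of averaging features within each group, and the rule for connecting simplified joints are all assumptions made for illustration, and the group convolutional block and multi-scale fusion are not modelled here.

```python
import numpy as np


def simplify_skeleton(x, adjacency, groups):
    """Merge groups of adjacent joints into simplified joints (illustrative only).

    x:         (T, J, C) joint features over T frames
    adjacency: (J, J) symmetric 0/1 matrix of the original skeleton
    groups:    groups[k] lists the original joints merged into simplified joint k
    """
    num_joints = adjacency.shape[0]
    # Assignment matrix: assign[j, k] = 1 if joint j belongs to simplified joint k.
    assign = np.zeros((num_joints, len(groups)))
    for k, members in enumerate(groups):
        assign[members, k] = 1.0
    # Assumed pooling choice: average the features of the joints in each group.
    pooled = np.einsum("tjc,jk->tkc", x, assign / assign.sum(axis=0))
    # Assumed connection rule: two simplified joints are linked if any of
    # their member joints were linked in the original skeleton.
    coarse = (assign.T @ adjacency @ assign > 0).astype(float)
    np.fill_diagonal(coarse, 0.0)
    return pooled, coarse


# Toy example: a chain of six joints 0-1-2-3-4-5, merged pairwise.
chain = np.zeros((6, 6))
for i in range(5):
    chain[i, i + 1] = chain[i + 1, i] = 1.0
x = np.arange(2 * 6 * 3, dtype=float).reshape(2, 6, 3)
pooled, coarse = simplify_skeleton(x, chain, [[0, 1], [2, 3], [4, 5]])
print(pooled.shape)
print(coarse)
```

In this toy chain the two end joints are five edges apart before merging and two edges apart afterwards, which is the distance reduction the abstract refers to.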
Pages: 992-1003
Page count: 12