Improved Shift Graph Convolutional Network for Action Recognition With Skeleton

Cited by: 7

Authors:

Li, Chuankun [1]

Li, Shuai [2,3]

Gao, Yanbo [2,3]

Guo, Lina [1]

Li, Wanqing [4]

Affiliations:

[1] North Univ China, Sch Informat & Commun Engn, State Key Lab Dynam Testing Technol, Taiyuan 030051, Peoples R China

[2] Shandong Univ, Sch Control Sci & Engn, Sch Software, Jinan 250100, Peoples R China

[3] Shandong Univ, Weihai Res Inst Ind Technol, Weihai 264209, Peoples R China

[4] Univ Wollongong, Adv Multimedia Res Lab, Wollongong, NSW 2522, Australia

Source:

IEEE SIGNAL PROCESSING LETTERS | 2023 / Vol. 30

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

Convolution; Skeleton; Computational complexity; Feature extraction; Convolutional neural networks; Kernel; Correlation; Action recognition; shift-GCN; skeleton;

DOI:

10.1109/LSP.2023.3267975

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronic and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Shift graph convolutional network (Shift-GCN) achieves remarkable performance for skeleton-based action recognition with lower computational complexity than other GCN-based methods. However, the current Shift-GCN, with one spatial shift, a static mask and a local temporal convolution, cannot fully explore the spatial-temporal features among skeleton joints of different frames. To address these problems, an improved shift graph convolutional network (Ishift-GCN) is proposed in this letter. The Ishift-GCN consists of two parts: a bidirectional spatial shift graph convolution with a dynamic mask, and a multi-scale temporal shift graph convolution. The bidirectional spatial shift graph convolution exploits more spatial information among joints, and the dynamic mask, with stronger generalization ability, can learn different correlations among features of different joints for different actions. The multi-scale temporal shift graph convolution captures more temporal information by complementing the shifted features with multi-scale convolution. Furthermore, knowledge distillation is used to reduce computational complexity. Compared with Shift-GCN, the proposed Ishift-GCN achieves better results with lower computational complexity on two widely used benchmarks, namely the NTU-RGB+D and UAV-Human datasets.
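The idea of a bidirectional spatial shift can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the channel-indexed non-local shift of the original Shift-GCN, where channel i of a joint is taken from the joint i positions away (modulo the number of joints), and it assumes the two directions are fused by channel concatenation, which the abstract does not specify. The function names `spatial_shift` and `bidirectional_shift` are hypothetical, and the dynamic mask and pointwise convolution are omitted.

```python
def spatial_shift(features, direction=1):
    """Shift channel i of every joint by direction * i joints (mod N).

    features: list of N joints, each a list of C channel values.
    direction: +1 for a forward shift, -1 for a backward shift.
    """
    n = len(features)
    c = len(features[0])
    return [
        [features[(v + direction * i) % n][i] for i in range(c)]
        for v in range(n)
    ]


def bidirectional_shift(features):
    """Concatenate forward- and backward-shifted features per joint.

    Concatenation is an assumption made for this sketch; the abstract
    does not state how the two directions are combined.
    """
    forward = spatial_shift(features, +1)
    backward = spatial_shift(features, -1)
    return [f + b for f, b in zip(forward, backward)]


# Toy input: 4 joints, 3 channels; value 10*v + i marks joint v, channel i.
toy = [[10 * v + i for i in range(3)] for v in range(4)]
shifted = bidirectional_shift(toy)
```

In this toy input, each joint ends up with six channels: three gathered from joints ahead of it in the index order and three from joints behind it, so a following pointwise convolution would see neighbors in both directions.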


Pages: 438-442

Number of pages: 5

