Skeleton-based action recognition using sparse spatio-temporal GCN with edge effective resistance

Cited by: 27

Authors:

Ahmad, Tasweer [1]

Jin, Lianwen [1]

Lin, Luojun [1]

Tang, GuoZhi [1]

Affiliations:

[1] South China University of Technology, School of Electronic and Information Engineering, Guangzhou 510000, People's Republic of China

Source:

NEUROCOMPUTING | 2021 / Vol. 423

Keywords:

Graph convolutional neural networks; Graph sparsification; Self-attention graph pooling; SPARSIFICATION; NETWORK

DOI:

10.1016/j.neucom.2020.10.096

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Graph convolutional neural networks have achieved significant success in solving various machine learning and computer vision problems. For skeleton-based action recognition, graph convolutional neural networks are the most suitable choice since the human skeleton resembles a graph. Stacking body skeletons over the length of a video sequence results in a very complex spatio-temporal graph with many nodes and edges. Modeling the graph convolutional network directly with such a complex graph curtails performance due to the redundancy of insignificant nodes and edges in the graph. Also, for skeleton-based action recognition, long-term contextual information is of central importance, and many current architectures may fail to capture it. Therefore, to alleviate these problems, we propose a graph sparsification technique using edge effective resistance to better model the global context information and to eliminate redundant nodes and edges in the graph. Furthermore, we incorporate self-attention graph pooling to retain local properties and graph structures during the pooling operation. To the best of our knowledge, we are the first to apply graph sparsification using edge effective resistance for skeleton-based action recognition, and our proposed method is confirmed to be effective on action recognition, achieving state-of-the-art results on publicly available datasets: UTD-MHAD, J-HMDB, NTU-RGB+D-60, NTU-RGB+D-120 and Kinetics. © 2020 Elsevier B.V. All rights reserved.
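The abstract names edge effective resistance as the sparsification criterion but gives no formula. As a minimal sketch, assuming the standard definition R_uv = L+[u,u] + L+[v,v] - 2 L+[u,v] (with L+ the pseudoinverse of the graph Laplacian), the snippet below scores the edges of a toy four-node graph. The helper names `effective_resistances` and `keep_top_edges` are hypothetical, and the deterministic top-k selection is a simplification chosen for illustration; it is not taken from the paper, whose actual sparsification procedure is not described in this record.

```python
import numpy as np


def effective_resistances(num_nodes, edges, weights=None):
    """Return the effective resistance of every edge of an undirected graph."""
    if weights is None:
        weights = [1.0] * len(edges)
    # Build the weighted graph Laplacian L = D - A.
    laplacian = np.zeros((num_nodes, num_nodes))
    for (u, v), w in zip(edges, weights):
        laplacian[u, u] += w
        laplacian[v, v] += w
        laplacian[u, v] -= w
        laplacian[v, u] -= w
    # L is singular, so the Moore-Penrose pseudoinverse stands in for an inverse.
    l_pinv = np.linalg.pinv(laplacian)
    # R_uv = L+[u,u] + L+[v,v] - 2 * L+[u,v]
    return [l_pinv[u, u] + l_pinv[v, v] - 2.0 * l_pinv[u, v] for u, v in edges]


def keep_top_edges(edges, scores, k):
    """Illustrative rule (an assumption): keep the k highest-scoring edges."""
    order = sorted(range(len(edges)), key=lambda i: scores[i], reverse=True)
    return [edges[i] for i in sorted(order[:k])]


# Toy graph: a triangle (0, 1, 2) with one pendant edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
resistances = effective_resistances(4, edges)
for edge, r in zip(edges, resistances):
    print(edge, round(float(r), 4))
```

In this toy graph the pendant edge is a bridge, so it receives a higher effective resistance than the triangle edges, each of which has an alternative two-edge path in parallel. A resistance-based criterion therefore treats the triangle edges as the more redundant ones. Note that a plain top-k cut can disconnect a graph; sampling-based schemes that reweight the kept edges avoid that in expectation.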


Pages: 389-398

Page count: 10

References: 52 in total

[1] Ahmad, Tasweer; Jin, Lianwen; Feng, Jialuo; Tang, Guozhi. Human Action Recognition in Unconstrained Trimmed Videos Using Residual Attention Network and Joints Path Signature [J]. IEEE ACCESS, 2019, 7: 121212-121222

[2] Ahmad Z., IEEE SENS J

[3] [Anonymous], ARXIV181208008

[4] [Anonymous], IEEE T PATTERN ANAL

[5] [Anonymous], ARXIV170506950

[6] Chen C, 2015, IEEE IMAGE PROC, P168, DOI 10.1109/ICIP.2015.7350781

[7] Chen, Yuxin; Ma, Gaoqun; Yuan, Chunfeng; Li, Bing; Zhang, Hui; Wang, Fangshi; Hu, Weiming. Graph convolutional network with structure pooling and joint-wise channel attention for action recognition [J]. PATTERN RECOGNITION, 2020, 103

[8] Choutas, Vasileios; Weinzaepfel, Philippe; Revaud, Jerome; Schmid, Cordelia. PoTion: Pose MoTion Representation for Action Recognition [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7024-7033

[9] Eppstein, D; Galil, Z; Italiano, GF; Nissenzweig, A. Sparsification - A technique for speeding up dynamic graph algorithms [J]. JOURNAL OF THE ACM, 1997, 44 (05): 669-696

[10] Fey M., ARXIV190302428
