Skeleton-based action recognition with multi-stream, multi-scale dilated spatial-temporal graph convolution network

Cited by: 0

Authors:

Haiping Zhang

Xu Liu

Dongjin Yu

Liming Guan

Dongjing Wang

Conghao Ma

Zepeng Hu

Affiliations:

[1] Hangzhou Dianzi University, School of Computer Science

[2] Hangzhou Dianzi University, School of Information Engineering

[3] Hangzhou Dianzi University, School of Electronics and Information

Source:

Applied Intelligence | 2023 / Vol. 53

Keywords:

Action recognition; Skeleton; GCN; Multi-stream network

DOI:

Not available

Abstract:

Action recognition techniques based on skeleton data are receiving increasing attention in computer vision because they adapt well to dynamic environments and complex backgrounds. Representing human skeleton data as spatial-temporal graphs and processing them with graph convolutional networks (GCNs) has been shown to produce good recognition results. However, existing GCN methods often use a fixed-size convolution kernel to extract time-domain features, which may not suit multi-level model structures. In addition, fusing the streams of a multi-stream network in equal proportion may ignore differences in the recognition ability of the individual streams. Both issues affect the final recognition result. In this paper, we propose (1) a multi-scale dilated temporal graph convolution layer (MDTGCL) and (2) a multi-branch feature fusion (MFF) structure. The MDTGCL uses multiple convolution kernels and dilated convolution to better adapt to the multi-layer structure of the GCN model and to capture spatial-temporal context over longer periods, yielding richer behavioural features. MFF performs weighted fusion of the multi-stream outputs to obtain the final recognition result. Because higher-order skeleton data are highly discriminative and more conducive to human action recognition, we jointly model spatial information on joints and bones, their multiple motion information, and angle information pertaining to bones. Combining the above, we designed a multi-stream, multi-scale dilated spatial-temporal graph convolutional network (2M-STGCN) and conducted extensive experiments on two large datasets (NTU RGB+D 60 and Kinetics Skeleton 400), which showed that our model performs at a state-of-the-art (SOTA) level.
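The two ideas named in the abstract, dilated temporal convolution at several scales and weighted fusion of stream outputs, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the kernel sizes, the dilation rates, the choice to sum the branches, and the fusion weights are all assumptions made for illustration, and the sketch operates on a single scalar sequence rather than on the graph feature tensors a GCN would use.

```python
from typing import List, Sequence, Tuple


def dilated_temporal_conv(x: Sequence[float], kernel: Sequence[float],
                          dilation: int) -> List[float]:
    """1-D dilated convolution along the time axis with zero padding.

    The receptive field covers (len(kernel) - 1) * dilation + 1 frames,
    so a larger dilation sees a longer temporal context with the same
    number of weights.
    """
    pad = (len(kernel) - 1) * dilation // 2  # symmetric for odd kernel sizes
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t - pad + i * dilation
            if 0 <= idx < len(x):  # frames outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def multi_scale_branches(x: Sequence[float],
                         branches: Sequence[Tuple[Sequence[float], int]]) -> List[float]:
    """Run several (kernel, dilation) branches and sum them.

    Summation is an assumed merge rule, chosen only to keep the sketch short.
    """
    outs = [dilated_temporal_conv(x, k, d) for k, d in branches]
    return [sum(vals) for vals in zip(*outs)]


def weighted_fusion(stream_scores: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Fuse per-stream class scores using one scalar weight per stream."""
    if len(stream_scores) != len(weights):
        raise ValueError("need exactly one weight per stream")
    fused = [0.0] * len(stream_scores[0])
    for scores, w in zip(stream_scores, weights):
        for c, s in enumerate(scores):
            fused[c] += w * s
    return fused


def predict(stream_scores: Sequence[Sequence[float]],
            weights: Sequence[float]) -> int:
    """Return the index of the class with the highest fused score."""
    fused = weighted_fusion(stream_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical class scores from two streams (for example joint and bone).
scores = [[0.6, 0.4], [0.2, 0.8]]
equal_choice = predict(scores, [0.5, 0.5])     # equal-proportion fusion
weighted_choice = predict(scores, [0.9, 0.1])  # trust the first stream more
```

With these made-up scores, equal weighting and unequal weighting select different classes, which is the situation the abstract's argument against equal-proportion fusion is concerned with.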

Pages: 17629-17643

Page count: 14
