Skeleton-based action recognition with multi-stream, multi-scale dilated spatial-temporal graph convolution network

Authors
Haiping Zhang
Xu Liu
Dongjin Yu
Liming Guan
Dongjing Wang
Conghao Ma
Zepeng Hu
Affiliations
[1] Hangzhou Dianzi University, School of Computer Science
[2] Hangzhou Dianzi University, School of Information Engineering
[3] Hangzhou Dianzi University, School of Electronics and Information
Source
Applied Intelligence | 2023 / Vol. 53
Keywords
Action recognition; Skeleton; GCN; Multi-stream network
DOI
Not available
Abstract
Skeleton-based action recognition is receiving growing attention in computer vision because it adapts well to dynamic environments and complex backgrounds. Representing human skeleton data as spatial-temporal graphs and processing them with graph convolutional networks (GCNs) has been shown to produce good recognition results. However, existing GCN methods often use a fixed-size convolution kernel to extract time-domain features, which may not suit multi-level model structures well. In addition, fusing the streams of a multi-stream network in equal proportion may ignore differences in the recognition ability of individual streams. Both issues affect the final recognition result. In this paper, we propose (1) a multi-scale dilated temporal graph convolution layer (MDTGCL) and (2) a multi-branch feature fusion (MFF) structure. The MDTGCL uses multiple convolution kernels and dilated convolution to better adapt to the multi-layer structure of the GCN model and to capture contextual spatial-temporal information over longer periods, resulting in richer behavioural features. MFF applies weighted fusion to the outputs of the multiple streams to obtain the final recognition result. Because higher-order skeleton data are highly discriminative and more conducive to human action recognition, we jointly model spatial information on joints and bones, several kinds of motion information derived from them, and angle information pertaining to bones. Combining the above, we design a multi-stream, multi-scale dilated spatial-temporal graph convolutional network (2M-STGCN) and conduct extensive experiments on two large datasets (NTU RGB+D 60 and Kinetics Skeleton 400), which show that our model performs at a state-of-the-art (SOTA) level.
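The two ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give kernel sizes, dilation rates, or how the fusion weights are chosen, so the function names (`receptive_field`, `weighted_fusion`) and all numeric values below are assumptions made for illustration. The first function shows why a dilated temporal kernel covers more frames than a plain one of the same size; the second shows how unequal stream weights can change the fused prediction relative to equal-proportion fusion.

```python
import math


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of frames spanned by one dilated temporal kernel: 1 + (k - 1) * d."""
    return 1 + (kernel_size - 1) * dilation


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(stream_scores, weights):
    """Fuse per-stream class scores using one scalar weight per stream.

    stream_scores: list of per-stream score lists (same number of classes each).
    weights: hypothetical per-stream weights; how they are set is an assumption.
    """
    if len(stream_scores) != len(weights):
        raise ValueError("need exactly one weight per stream")
    fused = [0.0] * len(stream_scores[0])
    for scores, weight in zip(stream_scores, weights):
        for cls, prob in enumerate(softmax(scores)):
            fused[cls] += weight * prob
    return fused


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical temporal branches as (kernel_size, dilation) pairs.
branches = [(3, 1), (3, 2), (5, 2)]
spans = [receptive_field(k, d) for k, d in branches]

# Two hypothetical streams (for example joint and bone) scoring two classes.
joint_scores = [2.0, 0.0]
bone_scores = [0.0, 1.0]

equal = weighted_fusion([joint_scores, bone_scores], [0.5, 0.5])
weighted = weighted_fusion([joint_scores, bone_scores], [0.2, 0.8])

print(spans)
print(argmax(equal), argmax(weighted))
```

In this toy case, equal weights favour the class preferred by the first stream, while giving the second stream more weight flips the decision. This is the kind of difference that weighted fusion is meant to exploit when one stream is more reliable than another.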
Pages: 17629 - 17643
Number of pages: 14