Multi-stream Global-Local Motion Fusion Network for skeleton-based action recognition

Cited by: 0
Authors
Qi, Yanpeng [1]
Pang, Chen [1]
Liu, Yiliang [1, 3]
Lyu, Lei [1, 2]
Affiliations
[1] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan, Peoples R China
[2] Shandong Prov Key Lab Distributed Comp Software No, Jinan, Peoples R China
[3] Shandong Prov Acad Educ Recruitment & Examinat, Jinan, Peoples R China
Keywords
Action recognition; Grouping graph convolution; Spatial-temporal self-attention; Multi-stream fusion strategy; LSTM
DOI
10.1016/j.asoc.2023.110536
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Skeleton-based action recognition is widely used in areas such as human-machine interaction and virtual reality. Benefiting from their strong ability to represent structural data, graph convolutional networks (GCNs) have been developed to address this task by modeling human body skeletons as spatial-temporal graphs. However, most existing GCN-based methods ignore the diversity of motion information across the channels of the input feature. Enhancing the ability to capture long-term global correlations in the spatial and temporal dimensions is also a fundamental challenge. In this work, we propose a novel multi-stream framework, the Global-Local Motion Fusion Network (GLMFN), which integrates global and local motion information across the spatial and temporal dimensions. Specifically, we design a grouping graph convolution module to strengthen the aggregation of local spatial motion information. In addition, to learn richer semantic features, we propose two modules based on the self-attention operator: a spatial self-attention module and a temporal self-attention module. The former extracts long-term spatial motion relationships, while the latter captures long-term temporal motion relationships. Moreover, we present a multi-stream fusion strategy that applies a series of treatments to body joints to improve recognition accuracy. To validate the efficacy and efficiency of the proposed model, we perform extensive experiments on the NTU-RGBD and NTU-RGBD-120 datasets, and our method achieves state-of-the-art performance on both. (c) 2023 Published by Elsevier B.V.
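The abstract does not say how the streams are combined. As a rough illustration only, the sketch below assumes score-level (late) fusion, a common convention in multi-stream skeleton models: each stream's class scores are turned into probabilities and summed with per-stream weights. The stream names, the weights, and the logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_scores, weights=None):
    """Weighted sum of per-stream class probabilities.

    stream_scores: dict mapping a stream name to a list of class logits.
    weights: optional dict of per-stream weights (assumed; defaults to 1.0 each).
    Returns (fused_scores, predicted_class_index).
    """
    names = list(stream_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}
    num_classes = len(stream_scores[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(stream_scores[name])
        for c in range(num_classes):
            fused[c] += weights[name] * probs[c]
    predicted = max(range(num_classes), key=lambda c: fused[c])
    return fused, predicted


# Hypothetical logits for three streams over three action classes.
scores = {
    "joint": [2.0, 1.0, 0.1],
    "bone": [0.5, 2.5, 0.3],
    "joint_motion": [0.2, 1.8, 0.4],
}
fused, predicted = fuse_streams(scores)
```

In this toy input the joint stream alone favors class 0, while the fused scores favor class 1, which is the kind of disagreement that late fusion is meant to resolve. The fusion strategy in GLMFN itself may differ.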
Pages: 13