Joints-Centered Spatial-Temporal Features Fused Skeleton Convolution Network for Action Recognition

Cited by: 2
Authors
Song, Wenfeng [1]
Chu, Tangli [2]
Li, Shuai [3]
Li, Nannan [5]
Hao, Aimin [2, 4]
Qin, Hong [6]
Affiliations
[1] Beijing Informat Sci & Technol Univ, Comp Sch, Beijing 100101, Peoples R China
[2] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
[3] Zhongguancun Lab, Beijing, Peoples R China
[4] Chinese Acad Med Sci, Res Unit Virtual Body & Virtual Surg Technol, 2019RU004, Beijing, Peoples R China
[5] Dalian Maritime Univ, Sch Informat Sci & Technol, Dalian 116024, Peoples R China
[6] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
Funding
National Natural Science Foundation of China;
Keywords
Skeleton; Feature extraction; Convolution; Visualization; Task analysis; Joints; Data mining; Skeleton-based action recognition; spatial-temporal feature fusion; PDE diffusion; NEURAL-NETWORKS; GRAPH; REPRESENTATION; DESCRIPTOR; DIFFUSION; FUSION;
DOI
10.1109/TMM.2023.3324835
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Skeleton-based action recognition is crucial for natural human-computer interaction, dynamic behavior analysis, and behavior surveillance. The key challenge is to effectively capture the intrinsic local-global clues of the activity. However, it remains challenging to efficiently leverage multidimensional information related to joints' local visual appearances, global spatial relationships, and coherent temporal cues. To address this challenge, we propose a joints-centered spatial-temporal feature-fused framework for action recognition, which exploits skeleton-based graph diffusion and convolution. Specifically, we employ Partial Differential Equation (PDE) based skeleton graph diffusion to automatically activate and diffuse the salient appearance features of joints. This approach simultaneously integrates the joints' appearance clues and their hierarchical relationships at both the super-pixel level and structure level. The diffused appearance-related features of the joints are further fused with skeleton-related spatial-temporal features, and the resulting fused features are fed into a skeleton convolution network for action recognition. Our method was extensively evaluated on two public datasets (NTU-RGBD and UWA3D), and the results demonstrate the improved accuracy and effectiveness of our approach. Our code will be public.
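The diffusion-then-fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the simplest PDE on a graph (the heat equation dX/dt = -L X, integrated with explicit Euler steps), a hypothetical 5-joint chain skeleton, and plain concatenation as the fusion step. The function name `diffuse_features` and all parameter values are illustrative assumptions.

```python
import numpy as np


def diffuse_features(adjacency, features, step=0.1, n_steps=10):
    """Explicit Euler integration of the graph heat equation dX/dt = -L X.

    Assumption: the combinatorial Laplacian L = D - A stands in for whatever
    diffusion operator the paper actually uses.
    """
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    x = features.astype(float).copy()
    for _ in range(n_steps):
        x = x - step * (laplacian @ x)
    return x


# Hypothetical skeleton: 5 joints connected in a chain (not a real dataset layout).
num_joints = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
adjacency = np.zeros((num_joints, num_joints))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1.0

# Toy appearance feature: only joint 2 is initially "salient".
appearance = np.zeros((num_joints, 1))
appearance[2, 0] = 1.0
diffused = diffuse_features(adjacency, appearance)

# Toy skeleton features (e.g. 3D joint coordinates), fused by concatenation.
skeleton_feats = np.arange(num_joints * 3, dtype=float).reshape(num_joints, 3)
fused = np.concatenate([diffused, skeleton_feats], axis=1)
print(fused.shape)
```

In this sketch the salient value at joint 2 spreads to its neighbors along skeleton edges while the total feature mass is conserved; the fused per-joint matrix is what a downstream skeleton convolution network would consume.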
Pages: 4602-4616
Page count: 15
Related papers
50 in total
  • [1] Multi-Stream and Enhanced Spatial-Temporal Graph Convolution Network for Skeleton-Based Action Recognition
    Li, Fanjia
    Zhu, Aichun
    Xu, Yonggang
    Cui, Ran
    Hua, Gang
    IEEE ACCESS, 2020, 8 : 97757 - 97770
  • [2] Skeleton Driven Action Recognition Using an Image-Based Spatial-Temporal Representation and Convolution Neural Network
    Silva, Vinicius
    Soares, Filomena
    Leao, Celina P.
    Esteves, Joao Sena
    Vercelli, Gianni
    SENSORS, 2021, 21 (13)
  • [3] Multi-Branch Spatial-Temporal Attention Graph Convolution Network for Skeleton-based Action Recognition
    Wang, Daoshuai
    Li, Dewei
    Guan, Yaonan
    Wang, Gang
    Shao, Haibin
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 6487 - 6492
  • [4] Dynamic Semantic-Based Spatial-Temporal Graph Convolution Network for Skeleton-Based Human Action Recognition
    Xie, Jianyang
    Meng, Yanda
    Zhao, Yitian
    Nguyen, Anh
    Yang, Xiaoyun
    Zheng, Yalin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 6691 - 6704
  • [5] Multilevel Spatial-Temporal Excited Graph Network for Skeleton-Based Action Recognition
    Zhu, Yisheng
    Shuai, Hui
    Liu, Guangcan
    Liu, Qingshan
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 496 - 508
  • [6] Spatial-Temporal Dynamic Graph Attention Network for Skeleton-Based Action Recognition
    Rahevar, Mrugendrasinh
    Ganatra, Amit
    Saba, Tanzila
    Rehman, Amjad
    Bahaj, Saeed Ali
    IEEE ACCESS, 2023, 11 : 21546 - 21553
  • [7] Dynamic Spatial-temporal Hypergraph Convolutional Network for Skeleton-based Action Recognition
    Wang, Shengqin
    Zhang, Yongji
    Qi, Hong
    Zhao, Minghao
    Jiang, Yu
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2147 - 2152
  • [8] Spatial-temporal slowfast graph convolutional network for skeleton-based action recognition
    Fang, Zheng
    Zhang, Xiongwei
    Cao, Tieyong
    Zheng, Yunfei
    Sun, Meng
    IET COMPUTER VISION, 2022, 16 (03) : 205 - 217
  • [9] Spatial-Temporal gated graph attention network for skeleton-based action recognition
    Rahevar, Mrugendrasinh
    Ganatra, Amit
    PATTERN ANALYSIS AND APPLICATIONS, 2023, 26 (03) : 929 - 939
  • [10] Spatial-Temporal Adaptive Graph Convolutional Network for Skeleton-Based Action Recognition
    Hang, Rui
    Li, MinXian
    COMPUTER VISION - ACCV 2022, PT IV, 2023, 13844 : 172 - 188