DSTSA-GCN: Advancing skeleton-based gesture recognition with semantic-aware spatio-temporal topology modeling

Cited by: 1
Authors
Cui, Hu [1]
Huang, Renjing [2]
Zhang, Ruoyu [3]
Hayama, Tessai [1]
Affiliations
[1] Nagaoka Univ Technol, 1603-1 Kamitiomioka, Nagaoka, Niigata 9402188, Japan
[2] Guizhou Elect Technol Coll, Guiyang 550025, Guizhou, Peoples R China
[3] Guizhou Univ, Guiyang 550025, Guizhou, Peoples R China
Keywords
Human action recognition; Gesture recognition; Graph convolution networks; Spatial-temporal model; Neural network
DOI
10.1016/j.neucom.2025.130066
CLC Classification Number
TP18 [Artificial intelligence theory]
Subject Classification Code
081104; 0812; 0835; 1405
Abstract
Graph convolutional networks (GCNs) have emerged as a powerful tool for skeleton-based action and gesture recognition, thanks to their ability to model spatial and temporal dependencies in skeleton data. However, existing GCN-based methods face critical limitations: (1) they lack effective spatio-temporal topology modeling that captures dynamic variations in skeletal motion, and (2) they struggle to model multiscale structural relationships beyond local joint connectivity. To address these issues, we propose a novel framework called Dynamic Spatial-Temporal Semantic Awareness Graph Convolutional Network (DSTSA-GCN). DSTSA-GCN introduces three key modules: Group Channel-wise Graph Convolution (GC-GC), Group Temporal-wise Graph Convolution (GT-GC), and Multi-Scale Temporal Convolution (MS-TCN). GC-GC and GT-GC operate in parallel to independently model channel-specific and frame-specific correlations, enabling robust topology learning that accounts for temporal variations. Additionally, both modules employ a grouping strategy to adaptively capture multiscale structural relationships. Complementing this, MS-TCN enhances temporal modeling through group-wise temporal convolutions with diverse receptive fields. Extensive experiments demonstrate that DSTSA-GCN significantly improves the topology modeling capabilities of GCNs, achieving state-of-the-art performance on benchmark datasets for gesture and action recognition, including SHREC'17 Track, DHG-14/28, NTU-RGB+D, NTU-RGB+D-120 and NW-UCLA. The code will be publicly available at https://hucui2022.github.io/dstsa_gcn/.
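The abstract describes the block-level architecture only at a high level. The following is a minimal, illustrative PyTorch sketch of how a block with parallel GC-GC and GT-GC branches followed by MS-TCN could be arranged; it is not the authors' implementation (their code is at the URL above). Only the module names and the parallel-spatial-then-temporal arrangement come from the abstract; all tensor shapes, layer widths, the group count, the dot-product affinity in GT-GC, the dilation set in MS-TCN, and the summation/residual fusion are assumptions introduced here for illustration.

import torch
import torch.nn as nn


class GroupChannelGraphConv(nn.Module):
    # GC-GC sketch: one learnable joint-to-joint topology per channel group,
    # shared across frames (channel-specific correlations; details assumed).
    def __init__(self, channels, num_joints, groups=4):
        super().__init__()
        self.groups = groups
        self.adj = nn.Parameter(torch.randn(groups, num_joints, num_joints) * 0.01)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):  # x: (N, C, T, V)
        n, c, t, v = x.shape
        xg = x.view(n, self.groups, c // self.groups, t, v)
        out = torch.einsum('ngctv,gvw->ngctw', xg, self.adj.softmax(-1))
        return self.proj(out.reshape(n, c, t, v))


class GroupTemporalGraphConv(nn.Module):
    # GT-GC sketch: a frame-specific topology inferred from the input via an
    # embedded dot product (frame-specific correlations; grouping omitted here).
    def __init__(self, channels):
        super().__init__()
        self.theta = nn.Conv2d(channels, channels // 4, kernel_size=1)
        self.phi = nn.Conv2d(channels, channels // 4, kernel_size=1)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):  # x: (N, C, T, V)
        q = self.theta(x).permute(0, 2, 3, 1)   # (N, T, V, C')
        k = self.phi(x).permute(0, 2, 1, 3)     # (N, T, C', V)
        adj = torch.softmax(q @ k, dim=-1)      # (N, T, V, V), one graph per frame
        out = torch.einsum('nctv,ntvw->nctw', x, adj)
        return self.proj(out)


class MultiScaleTemporalConv(nn.Module):
    # MS-TCN sketch: parallel temporal convolution branches with different
    # dilations, i.e. group-wise temporal modeling with diverse receptive fields.
    def __init__(self, channels, dilations=(1, 2, 3, 4)):
        super().__init__()
        branch_c = channels // len(dilations)
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, branch_c, kernel_size=1),
                nn.Conv2d(branch_c, branch_c, kernel_size=(5, 1),
                          padding=(2 * d, 0), dilation=(d, 1)),
            )
            for d in dilations
        ])

    def forward(self, x):  # x: (N, C, T, V)
        return torch.cat([branch(x) for branch in self.branches], dim=1)


class DSTSABlock(nn.Module):
    # One block: GC-GC and GT-GC run in parallel on the same input, their
    # outputs are fused, and MS-TCN then models temporal dependencies.
    def __init__(self, channels, num_joints):
        super().__init__()
        self.gc_gc = GroupChannelGraphConv(channels, num_joints)
        self.gt_gc = GroupTemporalGraphConv(channels)
        self.ms_tcn = MultiScaleTemporalConv(channels)

    def forward(self, x):
        spatial = self.gc_gc(x) + self.gt_gc(x)  # fusion by summation (assumption)
        return self.ms_tcn(spatial) + x          # residual connection (assumption)


if __name__ == "__main__":
    # Toy skeleton input: batch 2, 64 channels, 32 frames, 22 hand joints.
    x = torch.randn(2, 64, 32, 22)
    print(DSTSABlock(channels=64, num_joints=22)(x).shape)  # torch.Size([2, 64, 32, 22])

The point the sketch is meant to capture is the complementarity stated in the abstract: the GC-GC topology is specific to channel groups but shared across frames, while the GT-GC topology is inferred per frame, and temporal aggregation happens only afterwards in MS-TCN.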
Pages: 14