Joints-Centered Spatial-Temporal Features Fused Skeleton Convolution Network for Action Recognition

Cited by: 2

Authors:

Song, Wenfeng [1]

Chu, Tangli [2]

Li, Shuai [3]

Li, Nannan [5]

Hao, Aimin [2,4]

Qin, Hong [6]

Affiliations:

[1] Beijing Informat Sci & Technol Univ, Comp Sch, Beijing 100101, Peoples R China

[2] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China

[3] Zhongguancun Lab, Beijing, Peoples R China

[4] Chinese Acad Med Sci, Res Unit Virtual Body & Virtual Surg Technol, 2019RU004, Beijing, Peoples R China

[5] Dalian Maritime Univ, Sch Informat Sci & Technol, Dalian 116024, Peoples R China

[6] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2024 / Vol. 26

Funding:

National Natural Science Foundation of China;

Keywords:

Skeleton; Feature extraction; Convolution; Visualization; Task analysis; Joints; Data mining; Skeleton-based action recognition; spatial-temporal feature fusion; PDE diffusion; NEURAL-NETWORKS; GRAPH; REPRESENTATION; DESCRIPTOR; DIFFUSION; FUSION;

DOI:

10.1109/TMM.2023.3324835

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Skeleton-based action recognition is crucial for natural human-computer interaction, dynamic behavior analysis, and behavior surveillance. The key challenge is to effectively capture the intrinsic local-global clues of the activity. However, it remains challenging to efficiently leverage multidimensional information related to joints' local visual appearances, global spatial relationships, and coherent temporal cues. To address this challenge, we propose a joints-centered spatial-temporal feature-fused framework for action recognition, which exploits skeleton-based graph diffusion and convolution. Specifically, we employ Partial Differential Equation (PDE) based skeleton graph diffusion to automatically activate and diffuse the salient appearance features of joints. This approach simultaneously integrates the joints' appearance clues and their hierarchical relationships at both the super-pixel level and structure level. The diffused appearance-related features of the joints are further fused with skeleton-related spatial-temporal features, and the resulting fused features are fed into a skeleton convolution network for action recognition. Our method was extensively evaluated on two public datasets (NTU-RGBD and UWA3D), and the results demonstrate the improved accuracy and effectiveness of our approach. Our code will be made publicly available.
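The abstract does not give the diffusion equation or the fusion operator, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes the simplest PDE on a graph, the heat equation du/dt = -Lu with L the graph Laplacian, integrated with explicit Euler steps, and it assumes fusion by plain concatenation. The five-joint chain, the saliency values, the joint coordinates, and the function names `build_neighbors`, `diffuse`, and `fuse` are all hypothetical.

```python
# Hypothetical 5-joint chain skeleton (illustrative only, not from the paper).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]
NUM_JOINTS = 5


def build_neighbors(num_joints, edges):
    """Return an adjacency list for an undirected skeleton graph."""
    neighbors = [[] for _ in range(num_joints)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


def diffuse(features, neighbors, step=0.2, iterations=10):
    """Explicit Euler steps of the graph heat equation du/dt = -L u.

    L is the combinatorial graph Laplacian (degree minus adjacency), so each
    update moves a joint's value toward the values of its neighbors.
    Assumption: step * max_degree <= 1, which keeps the update stable.
    """
    u = list(features)
    for _ in range(iterations):
        u = [
            u[i] + step * sum(u[j] - u[i] for j in neighbors[i])
            for i in range(len(u))
        ]
    return u


def fuse(appearance, skeleton):
    """Fuse per-joint features by concatenation (one simple, assumed choice)."""
    return [[a] + list(s) for a, s in zip(appearance, skeleton)]


neighbors = build_neighbors(NUM_JOINTS, EDGES)

# Made-up scalar appearance response, active only at the middle joint.
saliency = [0.0, 0.0, 1.0, 0.0, 0.0]
diffused = diffuse(saliency, neighbors)

# Made-up 2D joint positions standing in for skeleton features.
coords = [(0.0, 1.8), (0.0, 1.6), (0.0, 1.2), (0.0, 0.9), (0.0, 0.5)]
fused = fuse(diffused, coords)
```

In this toy setting the response that starts at one joint spreads along the skeleton edges while its total is preserved, and each fused row holds one diffused appearance value followed by that joint's coordinates. A real pipeline would use learned, multi-channel features and a skeleton convolution network on top.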


Pages: 4602-4616

Page count: 15

