Human Skeleton Feature Optimizer and Adaptive Structure Enhancement Graph Convolution Network for Action Recognition

Cited by: 19

Authors:

Xiong, Xin ^{[1,2,3]}

Min, Weidong ^{[3,4]}

Wang, Qi ^{[5]}

Zha, Cheng ^{[6]}

Affiliations:

[1] Nanchang Univ, Affiliated Hosp 1, Informat Dept, Nanchang, Peoples R China

[2] Nanchang Univ, Inst Metaverse, Nanchang 330031, Peoples R China

[3] Jiangxi Key Lab Smart City, Nanchang, Peoples R China

[4] Nanchang Univ, Inst Metaverse, Sch Math & Comp Sci, Nanchang 330047, Peoples R China

[5] Nanchang Univ, Sch Software, Nanchang 330047, Peoples R China

[6] Nanchang Univ, Sch Math & Comp Sci, Nanchang 330031, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2023, Vol. 33, No. 01

Funding:

National Natural Science Foundation of China

Keywords:

Feature extraction; Skeleton; Convolution; Data mining; Directed graphs; Smart cities; Kernel; Action recognition; graph convolution network; skeleton feature optimizer; graph structure mask; directed graph mapping; adaptive pooling operation; KNOWLEDGE DISTILLATION; INTERNET;

DOI:

10.1109/TCSVT.2022.3201186

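Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Subject classification codes: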

0808 ; 0809 ;

Abstract:

Human action recognition based on the graph convolution network (GCN) is a hot topic in computer vision. Existing GCN-based methods fail to capture internal implicit information when extracting action features, thereby leading to over-smoothing in the training stage. These issues result in poor performance and inaccurate extraction of action features. To address these problems, a new GCN is constructed. In this paper, a human skeleton feature optimizer (SFO) and an adaptive structure enhancement graph convolution network (ASE-GCN) for action recognition are proposed in an end-to-end manner. To obtain discriminative features, the SFO constructs a new skeleton representation for action recognition through the connection criterion, which extracts the internal implicit information of the action. In the proposed ASE-GCN, the action features of the joint coordinates are extracted by a graph structure mask (GSM), directed graph mapping (DGM), and an adaptive pooling operation (APO). The GSM acts as a regularizer of skeleton structure information to strengthen the representation of the graph structure. The DGM correlates the directed graph with human motion information through kinematic principles, and the APO strengthens the global high-frequency features to alleviate over-smoothing. In experiments on two large-scale public datasets, NTU-RGB+D and Kinetics, the proposed method achieves results comparable or superior to those of state-of-the-art methods.


Pages: 342-353

Page count: 12

