Constructing Stronger and Faster Baselines for Skeleton-Based Action Recognition

Cited by: 250

Authors:

Song, Yi-Fan [1,2]

Zhang, Zhang [1,2]

Shan, Caifeng [3,4]

Wang, Liang [1,2]

Affiliations:

[1] Univ Chinese Acad Sci UCAS, Sch Artificial Intelligence, Beijing 100190, Peoples R China

[2] Chinese Acad Sci CASIA, Inst Automat, Ctr Res Intelligent Percept & Comp CRIPAC, Natl Lab Pattern Recognit NLPR, Beijing 100190, Peoples R China

[3] Shandong Univ Sci & Technol SDUST, Coll Elect Engn & Automation, Qingdao 266590, Peoples R China

[4] Chinese Acad Sci CAS AIR, Artificial Intelligence Res, Beijing 100190, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2023, Vol. 45, No. 2

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China

Keywords:

Action recognition; skeleton sequence; graph convolutional network; EfficientNet; separable convolution; PERSON REIDENTIFICATION;

DOI:

10.1109/TPAMI.2022.3157033

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

One essential problem in skeleton-based action recognition is how to extract discriminative features over all skeleton joints. However, recent State-Of-The-Art (SOTA) models for this task tend to be exceedingly sophisticated and over-parameterized. The low efficiency in model training and inference has increased the validation costs of model architectures in large-scale datasets. To address the above issue, recent advanced separable convolutional layers are embedded into an early fused Multiple Input Branches (MIB) network, constructing an efficient Graph Convolutional Network (GCN) baseline for skeleton-based action recognition. In addition, based on this baseline, we design a compound scaling strategy to expand the model's width and depth synchronously, and eventually obtain a family of efficient GCN baselines with high accuracies and small amounts of trainable parameters, termed EfficientGCN-Bx, where "x" denotes the scaling coefficient. On two large-scale datasets, i.e., NTU RGB+D 60 and 120, the proposed EfficientGCN-B4 baseline outperforms other SOTA methods, e.g., achieving 92.1% accuracy on the cross-subject benchmark of the NTU 60 dataset, while being 5.82x smaller and 5.85x faster than MS-G3D, which is one of the SOTA methods. The source code in PyTorch version and the pretrained models are available at https://github.com/yfsong0709/EfficientGCNv1.
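
The two ideas named in the abstract, separable convolution and compound scaling of width and depth, can be illustrated with a small sketch. It is not the authors' implementation: the function names, the base width and depth, and the constants `alpha` and `beta` are hypothetical values chosen for illustration, and the scaling rule assumes the EfficientNet-style form in which depth and width grow as powers of the scaling coefficient x.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def separable_conv_params(c_in, c_out, k):
    """Weight count of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise


def compound_scale(base_width, base_depth, x, alpha=1.2, beta=1.35):
    """Scale depth by alpha**x and width by beta**x together.

    alpha and beta are illustrative assumptions, not values from the paper.
    """
    depth = round(base_depth * alpha ** x)
    width = round(base_width * beta ** x)
    return width, depth


if __name__ == "__main__":
    std = standard_conv_params(64, 128, 3)
    sep = separable_conv_params(64, 128, 3)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
    for x in range(5):
        print(f"x={x}: (width, depth) = {compound_scale(64, 4, x)}")
```

Under these assumptions, replacing a standard 3 x 3 convolution with a separable one cuts the weight count several-fold for typical channel sizes, which is the kind of saving that makes a scaled family of models affordable.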


Pages: 1474-1488

Page count: 15
