Constructing Stronger and Faster Baselines for Skeleton-Based Action Recognition

Cited by: 250
Authors
Song, Yi-Fan [1, 2]
Zhang, Zhang [1, 2]
Shan, Caifeng [3, 4]
Wang, Liang [1, 2]
Affiliations
[1] Univ Chinese Acad Sci UCAS, Sch Artificial Intelligence, Beijing 100190, Peoples R China
[2] Chinese Acad Sci CASIA, Inst Automat, Ctr Res Intelligent Percept & Comp CRIPAC, Natl Lab Pattern Recognit NLPR, Beijing 100190, Peoples R China
[3] Shandong Univ Sci & Technol SDUST, Coll Elect Engn & Automation, Qingdao 266590, Peoples R China
[4] Chinese Acad Sci CAS AIR, Artificial Intelligence Res, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Action recognition; skeleton sequence; graph convolutional network; EfficientNet; separable convolution; PERSON REIDENTIFICATION;
DOI
10.1109/TPAMI.2022.3157033
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One essential problem in skeleton-based action recognition is how to extract discriminative features over all skeleton joints. However, recent State-Of-The-Art (SOTA) models for this task tend to be exceedingly sophisticated and over-parameterized. The low efficiency in model training and inference has increased the cost of validating model architectures on large-scale datasets. To address this issue, recent advanced separable convolutional layers are embedded into an early fused Multiple Input Branches (MIB) network, constructing an efficient Graph Convolutional Network (GCN) baseline for skeleton-based action recognition. In addition, based on this baseline, we design a compound scaling strategy to expand the model's width and depth synchronously, and eventually obtain a family of efficient GCN baselines with high accuracies and small amounts of trainable parameters, termed EfficientGCN-Bx, where "x" denotes the scaling coefficient. On two large-scale datasets, i.e., NTU RGB+D 60 and 120, the proposed EfficientGCN-B4 baseline outperforms other SOTA methods, e.g., achieving 92.1% accuracy on the cross-subject benchmark of the NTU 60 dataset, while being 5.82x smaller and 5.85x faster than MS-G3D, which is one of the SOTA methods. The PyTorch source code and the pretrained models are available at https://github.com/yfsong0709/EfficientGCNv1.
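As a rough illustration of the compound scaling idea mentioned in the abstract, the Python sketch below derives per-stage width and depth from a single scaling coefficient x. The function name compound_scale, the multipliers alpha and beta, the rounding divisor, and the base configuration are all hypothetical placeholders chosen for illustration; they are not taken from the paper or its released code.

```python
def compound_scale(base_channels, base_layers, x, alpha=1.2, beta=1.35, divisor=16):
    """Scale depth and width together from one coefficient x.

    alpha, beta, and divisor are assumed values for this sketch only.
    """
    depth_mult = alpha ** x  # grows the number of layers per stage
    width_mult = beta ** x   # grows the number of channels per stage

    # Round layer counts to the nearest integer, keeping at least one layer.
    layers = [max(1, int(round(n * depth_mult))) for n in base_layers]

    # Round channel counts to a multiple of `divisor`, keeping at least `divisor`.
    channels = [
        max(divisor, int(round(c * width_mult / divisor)) * divisor)
        for c in base_channels
    ]
    return channels, layers


# Hypothetical base configuration: channels and layers for two stages.
BASE_CHANNELS = [64, 128]
BASE_LAYERS = [1, 2]

if __name__ == "__main__":
    for x in (0, 2, 4):
        print(x, compound_scale(BASE_CHANNELS, BASE_LAYERS, x))
```

With x = 0 both multipliers equal 1, so the base configuration is returned unchanged; larger x widens and deepens every stage at the same time, which is the sense in which the scaling is "synchronous".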
Pages: 1474-1488
Number of pages: 15
Related papers
50 in total
  • [31] Lighter and faster: A multi-scale adaptive graph convolutional network for skeleton-based action recognition
    Jiang, Yuanjian
    Deng, Hongmin
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 132
  • [32] Action Tree Convolutional Networks: Skeleton-Based Human Action Recognition
    Liu, Wenjie
    Zhang, Ziyi
    Han, Bing
    Zhu, Chenhui
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT III, 2018, 11166 : 783 - 792
  • [33] Transformer for Skeleton-based action recognition: A review of recent advances
    Xin, Wentian
    Liu, Ruyi
    Liu, Yi
    Chen, Yu
    Yu, Wenxin
    Miao, Qiguang
    NEUROCOMPUTING, 2023, 537 : 164 - 186
  • [34] Skeleton-Based Action Recognition with Shift Graph Convolutional Network
    Cheng, Ke
    Zhang, Yifan
    He, Xiangyu
    Chen, Weihan
    Cheng, Jian
    Lu, Hanqing
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 180 - 189
  • [35] Skeleton-Based Action Recognition with Directed Graph Neural Networks
    Shi, Lei
    Zhang, Yifan
    Cheng, Jian
    Lu, Hanqing
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 7904 - 7913
  • [36] HybridNet: Integrating GCN and CNN for skeleton-based action recognition
    Wenjie Yang
    Jianlin Zhang
    Jingju Cai
    Zhiyong Xu
    Applied Intelligence, 2023, 53 : 574 - 585
  • [37] InfoGCN: Representation Learning for Human Skeleton-based Action Recognition
    Chi, Hyung-gun
    Ha, Myoung Hoon
    Chi, Seunggeun
    Lee, Sang Wan
    Huang, Qixing
    Ramani, Karthik
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 20154 - 20164
  • [38] A lightweight graph convolutional network for skeleton-based action recognition
    Dinh-Tan Pham
    Quang-Tien Pham
    Tien-Thanh Nguyen
    Thi-Lan Le
    Hai Vu
    Multimedia Tools and Applications, 2023, 82 : 3055 - 3079
  • [39] Fusion sampling networks for skeleton-based human action recognition
    Chen, Guannan
    Wei, Shimin
    JOURNAL OF ELECTRONIC IMAGING, 2022, 31 (05)
  • [40] Hierarchical Soft Quantization for Skeleton-Based Human Action Recognition
    Yang, Jianyu
    Liu, Wu
    Yuan, Junsong
    Mei, Tao
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 883 - 898