Constructing Stronger and Faster Baselines for Skeleton-Based Action Recognition

Cited by: 250
Authors
Song, Yi-Fan [1 ,2 ]
Zhang, Zhang [1 ,2 ]
Shan, Caifeng [3 ,4 ]
Wang, Liang [1 ,2 ]
Affiliations
[1] Univ Chinese Acad Sci UCAS, Sch Artificial Intelligence, Beijing 100190, Peoples R China
[2] Chinese Acad Sci CASIA, Inst Automat, Ctr Res Intelligent Percept & Comp CRIPAC, Natl Lab Pattern Recognit NLPR, Beijing 100190, Peoples R China
[3] Shandong Univ Sci & Technol SDUST, Coll Elect Engn & Automation, Qingdao 266590, Peoples R China
[4] Chinese Acad Sci CAS AIR, Artificial Intelligence Res, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Action recognition; skeleton sequence; graph convolutional network; EfficientNet; separable convolution; PERSON REIDENTIFICATION;
DOI
10.1109/TPAMI.2022.3157033
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One essential problem in skeleton-based action recognition is how to extract discriminative features over all skeleton joints. However, recent State-Of-The-Art (SOTA) models for this task tend to be exceedingly sophisticated and over-parameterized. The low efficiency in model training and inference has increased the validation costs of model architectures on large-scale datasets. To address this issue, recent advanced separable convolutional layers are embedded into an early fused Multiple Input Branches (MIB) network, constructing an efficient Graph Convolutional Network (GCN) baseline for skeleton-based action recognition. In addition, based on this baseline, we design a compound scaling strategy to expand the model's width and depth synchronously, and eventually obtain a family of efficient GCN baselines with high accuracies and small amounts of trainable parameters, termed EfficientGCN-Bx, where "x" denotes the scaling coefficient. On two large-scale datasets, i.e., NTU RGB+D 60 and 120, the proposed EfficientGCN-B4 baseline outperforms other SOTA methods, e.g., achieving 92.1% accuracy on the cross-subject benchmark of the NTU 60 dataset, while being 5.82x smaller and 5.85x faster than MS-G3D, which is one of the SOTA methods. The PyTorch source code and the pretrained models are available at https://github.com/yfsong0709/EfficientGCNv1.
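The two ideas named in the abstract, separable convolution and compound scaling, can be illustrated with a minimal Python sketch. It is not taken from the EfficientGCN code. The function names are hypothetical, and the coefficients `alpha` and `beta` and the rounding rules are illustrative assumptions in the style of EfficientNet, not values confirmed by this record.

```python
import math


def standard_conv_params(c_in, c_out, kernel):
    """Weight count of a standard 1D convolution (bias omitted)."""
    return c_in * c_out * kernel


def separable_conv_params(c_in, c_out, kernel):
    """Weight count of a depthwise-separable 1D convolution (bias omitted).

    Depthwise step: one kernel per input channel (c_in * kernel).
    Pointwise step: a 1x1 convolution mixing channels (c_in * c_out).
    """
    return c_in * kernel + c_in * c_out


def compound_scale(base_width, base_depth, x, alpha=1.2, beta=1.35):
    """Scale width and depth together by a single coefficient x.

    alpha and beta are illustrative assumptions, not the paper's values.
    """
    width = int(round(base_width * alpha ** x))
    depth = int(math.ceil(base_depth * beta ** x))
    return width, depth


if __name__ == "__main__":
    # Hypothetical layer: 64 -> 128 channels, temporal kernel of size 9.
    print(standard_conv_params(64, 128, 9))
    print(separable_conv_params(64, 128, 9))
    for x in range(5):
        print(x, compound_scale(64, 2, x))
```

For the hypothetical 64-to-128-channel layer, the separable form needs 8,768 weights against 73,728 for the standard one, which is the kind of saving that motivates separable layers in an efficiency-oriented baseline.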
Pages: 1474-1488
Page count: 15
Related papers
50 in total
  • [21] Convolutional relation network for skeleton-based action recognition
    Zhu, Jiagang
    Zou, Wei
    Zhu, Zheng
    Hu, Yiming
    NEUROCOMPUTING, 2019, 370 : 109 - 117
  • [22] SKELETON-BASED ACTION RECOGNITION WITH CONVOLUTIONAL NEURAL NETWORKS
    Li, Chao
    Zhong, Qiaoyong
    Xie, Di
    Pu, Shiliang
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2017,
  • [23] A Spatiotemporal Fusion Network For Skeleton-Based Action Recognition
    Bao, Wenxia
    Wang, Junyi
    Yang, Xianjun
    Chen, Hemu
    2024 3RD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND MEDIA COMPUTING, ICIPMC 2024, 2024, : 347 - 352
  • [24] Memory Attention Networks for Skeleton-Based Action Recognition
    Li, Ce
    Xie, Chunyu
    Zhang, Baochang
    Han, Jungong
    Zhen, Xiantong
    Chen, Jie
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (09) : 4800 - 4814
  • [25] SkeleTR: Towards Skeleton-based Action Recognition in the Wild
    Duan, Haodong
    Xu, Mingze
    Shuai, Bing
    Modolo, Davide
    Tu, Zhuowen
    Tighe, Joseph
    Bergamo, Alessandro
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 13588 - 13598
  • [26] Memory Attention Networks for Skeleton-based Action Recognition
    Xie, Chunyu
    Li, Ce
    Zhang, Baochang
    Chen, Chen
    Han, Jungong
    Liu, Jianzhuang
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 1639 - 1645
  • [27] SKELETON-BASED ACTION RECOGNITION USING LSTM AND CNN
    Li, Chuankun
    Wang, Pichao
    Wang, Shuang
    Hou, Yonghong
    Li, Wanqing
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2017,
  • [28] Pose Encoding for Robust Skeleton-Based Action Recognition
    Demisse, Girum G.
    Papadopoulos, Konstantinos
    Aouada, Djamila
    Ottersten, Bjorn
    PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2018, : 301 - 307
  • [29] Hypergraph Neural Network for Skeleton-Based Action Recognition
    Hao, Xiaoke
    Li, Jie
    Guo, Yingchun
    Jiang, Tao
    Yu, Ming
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 2263 - 2275
  • [30] Skeleton-based Action Recognition for Industrial Packing Process
    Chen, Zhenhui
    Hu, Haiyang
    Li, Zhongjin
    Qi, Xingchen
    Zhang, Haiping
    Hu, Hua
    Chang, Victor
    PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON INTERNET OF THINGS, BIG DATA AND SECURITY (IOTBDS), 2020, : 36 - 45