Transformer for Skeleton-based action recognition: A review of recent advances

Cited by: 38
Authors
Xin, Wentian [1, 2]
Liu, Ruyi [1, 2]
Liu, Yi [1, 2]
Chen, Yu [1, 2]
Yu, Wenxin [1, 2]
Miao, Qiguang [1, 2]
Affiliations
[1] Xian Key Lab Big Data & Intelligent Vis, Xian 710071, Shaanxi, Peoples R China
[2] Xidian Univ, Sch Comp Sci & Technol, 2 Taibainan Rd, Xian 710071, Shaanxi, Peoples R China
Keywords
Transformer; Graph convolution network; Skeleton-based action recognition; Spatial temporal structure; Survey; VIDEO SURVEILLANCE; NETWORKS; SYSTEM
DOI
10.1016/j.neucom.2023.03.001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Skeleton-based action recognition has rapidly become one of the most popular and essential research topics in computer vision. The task is to analyze the characteristics of human joints and accurately classify their behaviors through deep learning technology. Skeleton data provides numerous advantages over other data modalities, such as robustness, compactness, and noise immunity. In particular, the skeleton modality is extremely lightweight, which is especially beneficial for deep learning research in low-resource environments. Due to the non-Euclidean nature of skeleton data, the Graph Convolution Network (GCN) has become mainstream in the past few years, leveraging its strength in processing topological information. However, with the explosive development of transformer methods in natural language processing and computer vision, many works have applied transformers to skeleton action recognition, breaking the accuracy monopoly of GCN. Therefore, we conduct a survey of transformer methods for skeleton-based action recognition and form a taxonomy of existing works. This paper gives a comprehensive overview of recent transformer techniques for skeleton action recognition, proposes a taxonomy of transformer-style techniques for action recognition, conducts a detailed study of benchmark datasets, compares the accuracy of standard methods, and finally discusses future research directions and trends. To the best of our knowledge, this study is the first to describe skeleton-based action recognition techniques in the style of transformers and to suggest novel recognition taxonomies in a review. We are confident that Transformer-based action recognition technology will become mainstream in the near future, so this survey aims to help researchers systematically learn core tasks, select appropriate datasets, understand current challenges, and select promising future directions. (c) 2023 Elsevier B.V. All rights reserved.
Pages: 164 - 186
Page count: 23
Related papers
50 in total
  • [1] A GCN and Transformer complementary network for skeleton-based action recognition
    Xiang, Xuezhi
    Li, Xiaoheng
    Liu, Xuzhao
    Qiao, Yulong
    El Saddik, Abdulmotaleb
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249
  • [2] Graph-aware transformer for skeleton-based action recognition
    Zhang, Jiaxu
    Xie, Wei
    Wang, Chao
    Tu, Ruide
    Tu, Zhigang
    VISUAL COMPUTER, 2023, 39 (10): : 4501 - 4512
  • [3] Deformable graph convolutional transformer for skeleton-based action recognition
    Shuo Chen
    Ke Xu
    Bo Zhu
    Xinghao Jiang
    Tanfeng Sun
    Applied Intelligence, 2023, 53 : 15390 - 15406
  • [4] Deformable graph convolutional transformer for skeleton-based action recognition
    Chen, Shuo
    Xu, Ke
    Zhu, Bo
    Jiang, Xinghao
    Sun, Tanfeng
    APPLIED INTELLIGENCE, 2023, 53 (12) : 15390 - 15406
  • [5] Graph-aware transformer for skeleton-based action recognition
    Jiaxu Zhang
    Wei Xie
    Chao Wang
    Ruide Tu
    Zhigang Tu
    The Visual Computer, 2023, 39 : 4501 - 4512
  • [6] LORTSAR: Low-Rank Transformer for Skeleton-Based Action Recognition
    Oraki, Soroush
    Zhuang, Harry
    Liang, Jie
    ADVANCES IN VISUAL COMPUTING, ISVC 2024, PT I, 2025, 15046 : 196 - 207
  • [7] Improved ELBO-assisted Transformer for Skeleton-Based Action Recognition
    Bhattacharjee, Arnab
    Chen, Wen-Hui
    Lin, Yu-Chen
    Lai, Kuan-Ting
    2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 3997 - 4002
  • [8] Skeleton-based action recognition via spatial and temporal transformer networks
    Plizzari, Chiara
    Cannici, Marco
    Matteucci, Matteo
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 208 (208-209)
  • [9] Revisiting Skeleton-based Action Recognition
    Duan, Haodong
    Zhao, Yue
    Chen, Kai
    Lin, Dahua
    Dai, Bo
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 2959 - 2968
  • [10] STSD: spatial–temporal semantic decomposition transformer for skeleton-based action recognition
    Hu Cui
    Tessai Hayama
    Multimedia Systems, 2024, 30