Transformer for Skeleton-based action recognition: A review of recent advances

Cited by: 38
Authors
Xin, Wentian [1, 2]
Liu, Ruyi [1, 2]
Liu, Yi [1, 2]
Chen, Yu [1, 2]
Yu, Wenxin [1, 2]
Miao, Qiguang [1, 2]
Affiliations
[1] Xian Key Lab Big Data & Intelligent Vis, Xian 710071, Shaanxi, Peoples R China
[2] Xidian Univ, Sch Comp Sci & Technol, 2 Taibainan Rd, Xian 710071, Shaanxi, Peoples R China
Keywords
Transformer; Graph convolution network; Skeleton-based action recognition; Spatial temporal structure; Survey; VIDEO SURVEILLANCE; NETWORKS; SYSTEM
DOI
10.1016/j.neucom.2023.03.001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Skeleton-based action recognition has rapidly become one of the most popular and essential research topics in computer vision. The task is to analyze the characteristics of human joints and accurately classify human behaviors using deep learning. Skeleton data offers several advantages over other data modalities, such as robustness, compactness, and noise immunity. In particular, the skeleton modality is extremely lightweight, which is especially beneficial for deep learning research in low-resource environments. Because skeleton data is non-Euclidean, Graph Convolution Networks (GCNs) have become the mainstream approach in the past few years, leveraging their strength in processing topological information. However, with the explosive development of transformer methods in natural language processing and computer vision, many works have applied transformers to skeleton-based action recognition, breaking the accuracy monopoly of GCNs. We therefore survey transformer methods for skeleton-based action recognition and form a taxonomy of existing works. This paper gives a comprehensive overview of recent transformer techniques for skeleton action recognition, proposes a taxonomy of transformer-style techniques for action recognition, conducts a detailed study of benchmark datasets, compares the accuracy of standard methods, and finally discusses future research directions and trends. To the best of our knowledge, this study is the first review to describe skeleton-based action recognition techniques in the style of transformers and to suggest novel recognition taxonomies. We are confident that transformer-based action recognition technology will become mainstream in the near future, so this survey aims to help researchers systematically learn core tasks, select appropriate datasets, understand current challenges, and select promising future directions. © 2023 Elsevier B.V. All rights reserved.
Pages: 164-186
Number of pages: 23