Fusing angular features for skeleton-based action recognition using multi-stream graph convolution network

Cited by: 0
Authors
Huang, Qian [1, 2]
Liu, Wenting [1, 2]
Shang, Mingzhou [1, 2]
Wang, Yiming [1, 2]
Affiliations
[1] Hohai Univ, Coll Comp Sci & Software Engn, Nanjing, Peoples R China
[2] Hohai Univ, Key Lab Water Big Data Technol, Minist Water Resources, Nanjing, Peoples R China
Keywords
computer vision; video signal processing
DOI
10.1049/ipr2.13041
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Distinguishing similar actions has been a persistent challenge in skeleton-based action recognition. Because the joint coordinates in such actions are similar, it is difficult to accomplish the recognition task using traditional joint features. To address this issue, angle features are proposed to capture subtle differences in various body parts, along with a critical angle enhancement module that assigns weights to the different angle feature representations for a given action so that the critical representation is highlighted. The approach is evaluated using a three-stream ensemble method on three large action recognition datasets: NTU-RGB+D, NTU-RGB+D 120, and Kinetics-400. The experimental results demonstrate that incorporating angular information can effectively complement joint and skeletal features, leading to improved recognition of similar actions and enhanced model performance and robustness.
Human behaviour recognition is an important research direction in computer vision, with broad application prospects in areas such as human-computer interaction, smart healthcare, video surveillance, and sports motion analysis. However, current skeleton-based behaviour recognition methods using graph convolutional networks still face challenges, such as fully utilizing the dependencies among distant nodes and distinguishing similar actions. To address the limitations of existing graph convolution-based models in distinguishing similar actions, a multi-stream hierarchical perception graph convolutional network model that incorporates angle features is proposed. The model introduces four new angle feature representations to capture subtle variations in different body parts, providing discriminative features that differentiate action details. Additionally, it uses a key angle feature enhancement module to strengthen the angle features that matter for specific actions.
The model achieves recognition accuracies of 92.8% and 96.8% under the cross-subject and cross-view evaluation criteria of the NTU-RGB+D dataset, respectively, and accuracies of 89.2% and 90.8% under the cross-subject and cross-setup evaluation criteria of the NTU-RGB+D 120 dataset. The experimental results validate that angle information effectively enhances the model's accuracy and improves its ability to distinguish similar actions.
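The two ideas the abstract leans on, angle features computed from joint coordinates and a multi-stream ensemble, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the four angle representations or the fusion weights, so the function names `joint_angle` and `fuse_scores`, the example coordinates, the per-class scores, and the equal weights below are all hypothetical.

```python
import math

def joint_angle(a, b, c):
    """Angle in radians at joint b, between the segments b->a and b->c."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # degenerate case: two joints coincide
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)

def fuse_scores(stream_scores, weights):
    """Weighted sum of per-class scores across streams; returns (fused, argmax)."""
    num_classes = len(stream_scores[0])
    fused = [
        sum(w * scores[k] for w, scores in zip(weights, stream_scores))
        for k in range(num_classes)
    ]
    predicted = max(range(num_classes), key=fused.__getitem__)
    return fused, predicted

# Hypothetical 3D joint positions forming a right angle at the elbow.
shoulder, elbow, wrist = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)

# Hypothetical per-class scores from joint, bone, and angle streams.
joint_scores = [0.5, 0.4, 0.1]
bone_scores = [0.4, 0.5, 0.1]
angle_scores = [0.2, 0.7, 0.1]
fused, predicted = fuse_scores(
    [joint_scores, bone_scores, angle_scores], [1.0, 1.0, 1.0]
)
```

In this toy case the joint stream alone favours class 0, while the fused scores favour class 1, which is the kind of complementary effect the abstract attributes to angular information.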
Pages: 1694-1709
Page count: 16