Advancing skeleton-based human behavior recognition: multi-stream fusion spatiotemporal graph convolutional networks

Authors
Liu, Fenglin [1]
Wang, Chenyu [2]
Tian, Zhiqiang [2]
Du, Shaoyi [3]
Zeng, Wei [1]
Affiliations
[1] Longyan Univ, Sch Phys & Mech & Elect Engn, Longyan 364012, Peoples R China
[2] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China
[3] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian 710049, Peoples R China
Keywords
Behavior recognition; Skeleton-based spatiotemporal graph convolutional network; Multi-stream fusion; Long-range dependencies
DOI
10.1007/s40747-024-01743-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Everyday human interaction involves a wide range of behaviors and actions that carry rich information. As big data has grown, large collections of images and videos have become a primary channel for conveying information, and understanding the human behavior they depict has become a central problem in computer vision. Behavior recognition is applied in human-computer interaction, intelligent surveillance, and anomaly detection, so it has both practical and research value. This study proposes a human behavior recognition framework based on skeleton sequences and a multi-stream fusion spatiotemporal graph convolutional network. Building on graph convolutional networks, the method makes three improvements that target existing problems. First, because graph convolutional networks struggle to capture dependencies between distant nodes, we add a spatial attention module. The module uses precise positional information to capture long-range node dependencies, establishing connections across different temporal and spatial contexts. Second, to improve the network's ability to discriminate channel information and to allocate attention across channels more effectively, we introduce a channel attention mechanism, which strengthens the discrimination of motion-related features. Third, to compensate for the information missing from single-stream data, we adopt a multi-stream fusion strategy that strengthens the model outputs and yields more accurate action classification. Experimental results demonstrate the effectiveness of the proposed network for skeleton-based behavior recognition: the best recognition accuracy is 96.0% on the large-scale NTU-RGB+D skeleton dataset and 37.3% on the Kinetics-Skeleton dataset, which is derived from RGB data via pose estimation.
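The abstract does not say how the streams are combined. As a rough illustration only, the sketch below assumes a common late-fusion scheme: each stream produces class logits, the logits are turned into probabilities with a softmax, and the probabilities are averaged with per-stream weights. The stream names (`joint`, `bone`), the weights, and the function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_streams(stream_logits, weights=None):
    """Weighted average of per-stream class probabilities (late fusion).

    stream_logits: dict mapping a stream name to an array of shape
                   (batch, num_classes). Stream names are illustrative.
    weights:       optional dict of non-negative per-stream weights;
                   equal weighting is assumed when omitted.
    """
    names = sorted(stream_logits)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = float(sum(weights[name] for name in names))
    fused = 0.0
    for name in names:
        probs = softmax(np.asarray(stream_logits[name], dtype=float))
        fused = fused + (weights[name] / total) * probs
    return fused


def predict(stream_logits, weights=None):
    """Return the index of the highest fused probability for each sample."""
    return fuse_streams(stream_logits, weights).argmax(axis=-1)


# Hypothetical logits for one sample and three action classes.
example = {
    "joint": np.array([[2.0, 0.0, 0.0]]),
    "bone": np.array([[0.0, 3.0, 0.0]]),
}
equal_weight_prediction = predict(example)
joint_heavy_prediction = predict(example, weights={"joint": 3.0, "bone": 1.0})
```

With equal weights the more confident stream dominates the fused result; raising one stream's weight shifts the decision toward that stream. Fusing probabilities rather than raw logits keeps streams with different logit scales comparable, though this is a design choice of the sketch and not something the source confirms.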
Pages: 21