Action Recognition Based on 3D Skeleton and RGB Frame Fusion

Cited by: 0
Authors
Liu, Guiyu [1]
Qian, Jiuchao [1]
Wen, Fei [1]
Zhu, Xiaoguang [1]
Ying, Rendong [1]
Liu, Peilin [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai, Peoples R China
Source
2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2019
Keywords
SEGMENTATION;
DOI
10.1109/iros40897.2019.8967570
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Action recognition has wide applications in assisted living, health monitoring, surveillance, and human-computer interaction. Among traditional action recognition methods, RGB video-based ones are effective but computationally inefficient, while skeleton-based ones are computationally efficient but do not make use of low-level detail information. This work considers action recognition based on multimodal fusion of the 3D skeleton and the RGB image. We design a neural network that takes a 3D skeleton sequence and a single middle frame from an RGB video as input. Specifically, our method selects one frame from a video and extracts spatial features from it using two attention modules: a self-attention module and a skeleton-attention module. Temporal features are extracted from the skeleton sequence via a Bi-LSTM sub-network. Finally, the spatial and temporal features are combined by a feature fusion network for action classification. A distinctive feature of our method is that it uses only a single RGB frame rather than an RGB video. Accordingly, it has a lightweight architecture and is more efficient than RGB video-based methods. Comparative evaluation on two public datasets, NTU-RGBD and SYSU, demonstrates that our method achieves competitive performance compared with state-of-the-art methods.
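The data flow described in the abstract (one middle RGB frame, attention-weighted spatial features, a temporal summary of the skeleton sequence, then fusion) can be pictured with a small pure-Python sketch. This is only an illustration under stated assumptions, not the authors' network: every function name is hypothetical, the attention scores are given by hand rather than learned, a mean over time stands in for the Bi-LSTM, and fusion is assumed to be plain concatenation, which the abstract does not specify.

```python
import math
from typing import List, Sequence


def middle_frame_index(num_frames: int) -> int:
    """Pick the single RGB frame: here, the midpoint index (an assumption)."""
    if num_frames <= 0:
        raise ValueError("video must contain at least one frame")
    return num_frames // 2


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(regions: Sequence[Sequence[float]],
                   scores: Sequence[float]) -> List[float]:
    """Softmax-weighted sum of per-region feature vectors.

    In a real model the scores would come from learned attention modules;
    here they are supplied by hand.
    """
    weights = softmax(scores)
    dim = len(regions[0])
    return [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]


def temporal_summary(skeleton_seq: Sequence[Sequence[float]]) -> List[float]:
    """Stand-in for the Bi-LSTM: mean of the skeleton vectors over time."""
    steps = len(skeleton_seq)
    dim = len(skeleton_seq[0])
    return [sum(frame[d] for frame in skeleton_seq) / steps for d in range(dim)]


def fuse(spatial: Sequence[float], temporal: Sequence[float]) -> List[float]:
    """Fusion by concatenation (assumed; a learned fusion network is used in the paper)."""
    return list(spatial) + list(temporal)


# Toy input: a 5-frame video, 2 image regions with 2-D features,
# and a 2-step skeleton sequence of 3-D vectors.
frame_idx = middle_frame_index(5)
spatial = attention_pool([[1.0, 0.0], [0.0, 1.0]], scores=[0.0, 0.0])
temporal = temporal_summary([[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]])
fused = fuse(spatial, temporal)
```

The point of the sketch is the cost profile: only one frame passes through the spatial branch, while the temporal branch operates on the much smaller skeleton sequence.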
Pages: 258-264
Number of pages: 7
Related papers
50 in total
  • [31] ACTION RECOGNITION USING JOINT COORDINATES OF 3D SKELETON DATA
    Batabyal, Tamal
    Chattopadhyay, Tanushyam
    Mukherjee, Dipti Prasad
    2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 4107 - 4111
  • [32] Modeling the skeleton-language uncertainty for 3D action recognition
    Wang, Mingdao
    Zhang, Xianlin
    Chen, Siqi
    Li, Xueming
    Zhang, Yue
    NEUROCOMPUTING, 2024, 608
  • [33] Learning hierarchical 3D kernel descriptors for RGB-D action recognition
    Kong, Yu
    Satarboroujeni, Behnam
    Fu, Yun
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2016, 144 : 14 - 23
  • [34] Enhancing Robustness of Viewpoint Changes in 3D Skeleton-Based Human Action Recognition
    Park, Jinyoon
    Kim, Chulwoong
    Kim, Seung-Chan
    MATHEMATICS, 2023, 11 (15)
  • [35] 3D HUMAN ACTION RECOGNITION BASED ON THE SPATIAL-TEMPORAL MOVING SKELETON DESCRIPTOR
    Yao, Hongxian
    Jiang, Xinghao
    Sun, Tanfeng
    Wang, Shilin
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 937 - 942
  • [36] Spatiotemporal decoupling attention transformer for 3D skeleton-based driver action recognition
    Xu, Zhuoyan
    Xu, Jingke
    COMPLEX & INTELLIGENT SYSTEMS, 2025, 11 (04)
  • [37] Human skeleton representation for 3D action recognition based on complex network coding and LSTM
    Shen, Xiangpei
    Ding, Yanrui
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 82
  • [38] Rethinking the ST-GCNs for 3D skeleton-based human action recognition
    Peng, Wei
    Shi, Jingang
    Varanka, Tuomas
    Zhao, Guoying
    NEUROCOMPUTING, 2021, 454 : 45 - 53
  • [39] Deep Learning-Based Action Recognition Using 3D Skeleton Joints Information
    Tasnim, Nusrat
    Islam, Md. Mahbubul
    Baek, Joong-Hwan
    INVENTIONS, 2020, 5 (03) : 1 - 15
  • [40] Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints
    Caetano, Carlos
    Bremond, Francois
    Schwartz, William Robson
    2019 32ND SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 2019, : 16 - 23