Two-stream spatial-temporal neural networks for pose-based action recognition

Cited by: 2
Authors
Wang, Zixuan [1]
Zhu, Aichun [1,2]
Hu, Fangqiang [1]
Wu, Qianyu [1]
Li, Yifeng [1]
Affiliations
[1] Nanjing Tech Univ, Sch Comp Sci & Technol, Nanjing, Peoples R China
[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
action recognition; pose estimation; convolutional neural network; long short-term memory
DOI
10.1117/1.JEI.29.4.043025
CLC number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
With recent advances in human pose estimation and human skeleton capture systems, pose-based action recognition has drawn considerable attention from researchers. Most existing action recognition methods are based on convolutional neural networks and long short-term memory and perform well, but they cannot explicitly exploit the rich spatial-temporal information between the skeletons in an action, which limits recognition accuracy. To address this issue, two-stream spatial-temporal neural networks for pose-based action recognition are introduced. First, the pose features extracted from the raw video are processed by an action modeling module. Then, the temporal information and the spatial information, in the form of relative speed and relative distance, are fed into the temporal neural network and the spatial neural network, respectively. Afterward, the outputs of the two streams are fused for better action recognition. Finally, comprehensive experiments on the SUB-JHMDB, SYSU, MPII-Cooking, and NTU RGB+D datasets demonstrate the effectiveness of the proposed model. (C) 2020 SPIE and IS&T
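The abstract names relative distance and relative speed as the inputs to the spatial and temporal streams but does not define them. The following is a minimal sketch under stated assumptions, not the authors' method: relative distance is taken to be the pairwise Euclidean distance between joints in one frame, relative speed is taken to be the change in those distances between consecutive frames, and fusion is taken to be a weighted average of per-class scores. All function names, the `weight` parameter, and the toy data are hypothetical.

```python
import math


def relative_distances(frame):
    """Pairwise Euclidean distances between all joints in one frame.

    `frame` is a list of (x, y) joint coordinates (assumed layout).
    """
    n = len(frame)
    return [
        math.dist(frame[i], frame[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


def relative_speeds(prev_frame, next_frame, dt=1.0):
    """Change in each pairwise joint distance between two frames, per unit time."""
    d_prev = relative_distances(prev_frame)
    d_next = relative_distances(next_frame)
    return [(b - a) / dt for a, b in zip(d_prev, d_next)]


def fuse_scores(spatial_scores, temporal_scores, weight=0.5):
    """Late fusion (assumed): weighted average of the two streams' class scores."""
    return [
        weight * s + (1.0 - weight) * t
        for s, t in zip(spatial_scores, temporal_scores)
    ]


# Toy example: three joints over two frames (made-up coordinates).
frame_a = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
frame_b = [(0.0, 0.0), (6.0, 8.0), (0.0, 1.0)]

spatial_features = relative_distances(frame_a)         # input to the spatial stream
temporal_features = relative_speeds(frame_a, frame_b)  # input to the temporal stream

# Made-up per-class scores standing in for the two networks' outputs.
fused = fuse_scores([0.2, 0.8], [0.6, 0.4])
```

In a real pipeline the two feature sequences would be passed to trained networks, and the fusion rule could differ from the simple average assumed here.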
Pages: 16
Related papers
50 in total
  • [21] Spatial-Temporal Analysis-Based Video Quality Assessment: A Two-Stream Convolutional Network Approach
    He, Jianghui
    Wang, Zhe
    Liu, Yi
    Song, Yang
    ELECTRONICS, 2024, 13 (10)
  • [22] Spatial-temporal graph attention networks for skeleton-based action recognition
    Huang, Qingqing
    Zhou, Fengyu
    He, Jiakai
    Zhao, Yang
    Qin, Runze
    JOURNAL OF ELECTRONIC IMAGING, 2020, 29 (05)
  • [23] Fuzzy Fusion for Two-stream Action Recognition
    Sousa e Santos, Anderson Carlos
    Maia, Helena de Almeida
    Roberto e Souza, Marcos
    Vieira, Marcelo Bernardes
    Pedrini, Helio
    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 5: VISAPP, 2020, : 117 - 123
  • [24] Action recognition via pose-based graph convolutional networks with intermediate dense supervision
    Shi, Lei
    Zhang, Yifan
    Cheng, Jian
    Lu, Hanqing
    PATTERN RECOGNITION, 2022, 121
  • [25] Two-stream Action Recognition in Ice Hockey using Player Pose Sequences and Optical Flows
    Vats, Kanav
    Neher, Helmut
    Clausi, David A.
    Zelek, John
    2019 16TH CONFERENCE ON COMPUTER AND ROBOT VISION (CRV 2019), 2019, : 181 - 188
  • [26] Segment spatial-temporal representation and cooperative learning of convolution neural networks for multimodal-based action recognition
    Ren, Ziliang
    Zhang, Qieshi
    Cheng, Jun
    Hao, Fusheng
    Gao, Xiangyang
    NEUROCOMPUTING, 2021, 433 : 142 - 153
  • [27] Spatial-Temporal Attention for Action Recognition
    Sun, Dengdi
    Wu, Hanqing
    Ding, Zhuanlian
    Luo, Bin
    Tang, Jin
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I, 2018, 11164 : 854 - 864
  • [28] The Very Deep Multi-stage Two-stream Convolutional Neural Network for Action Recognition
    Gao, Xiuju
    Zhang, Hanling
    PROCEEDINGS OF THE 2016 3RD INTERNATIONAL CONFERENCE ON MECHATRONICS AND INFORMATION TECHNOLOGY (ICMIT), 2016, 49 : 265 - 269
  • [29] Skeleton action recognition using Two-Stream Adaptive Graph Convolutional Networks
    Lee, James
    Kang, Suk-ju
    2021 36TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC), 2021,
  • [30] Toward Efficient Action Recognition: Principal Backpropagation for Training Two-Stream Networks
    Huang, Wenbing
    Fan, Lijie
    Harandi, Mehrtash
    Ma, Lin
    Liu, Huaping
    Liu, Wei
    Gan, Chuang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (04) : 1773 - 1782