Two-stream spatial-temporal neural networks for pose-based action recognition

Cited by: 2
Authors
Wang, Zixuan [1 ]
Zhu, Aichun [1 ,2 ]
Hu, Fangqiang [1 ]
Wu, Qianyu [1 ]
Li, Yifeng [1 ]
Affiliations
[1] Nanjing Tech Univ, Sch Comp Sci & Technol, Nanjing, Peoples R China
[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
action recognition; pose estimation; convolutional neural network; long short-term memory;
DOI
10.1117/1.JEI.29.4.043025
CLC Number
TM [Electrical Technology]; TN [Electronic Technology and Communication Technology];
Subject Classification Code
0808; 0809;
Abstract
With recent advances in human pose estimation and human skeleton capture systems, pose-based action recognition has drawn considerable attention from researchers. Although most existing action recognition methods are based on convolutional neural networks and long short-term memory and achieve strong performance, they do not explicitly exploit the rich spatial-temporal information among the skeleton joints within an action, which limits recognition accuracy. To address this issue, a two-stream spatial-temporal neural network for pose-based action recognition is introduced. First, pose features extracted from the raw video are processed by an action modeling module. Then, the temporal information and the spatial information, in the form of relative speed and relative distance, are fed into the temporal neural network and the spatial neural network, respectively. Afterward, the outputs of the two streams are fused for better action recognition. Finally, comprehensive experiments on the SUB-JHMDB, SYSU, MPII-Cooking, and NTU RGB+D datasets demonstrate the effectiveness of the proposed model. (C) 2020 SPIE and IS&T
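The abstract describes a late-fusion, two-stream design: relative-speed features drive a temporal stream, relative-distance features drive a spatial stream, and the two stream outputs are combined for classification. The following minimal PyTorch sketch illustrates that general idea only; the layer sizes, feature dimensions, stream implementations, and concatenation-based fusion are assumptions for illustration, not the authors' code.

# Minimal sketch of a two-stream pose-based action recognizer.
# Assumptions (not from the paper): LSTM temporal stream, MLP spatial stream,
# feature dimensions, and late fusion by concatenation.
import torch
import torch.nn as nn


class TemporalStream(nn.Module):
    """LSTM over per-frame relative-speed features of the skeleton joints."""
    def __init__(self, in_dim, hidden_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden_dim, batch_first=True)

    def forward(self, x):                 # x: (batch, frames, in_dim)
        _, (h_n, _) = self.lstm(x)        # last hidden state summarizes motion
        return h_n[-1]                    # (batch, hidden_dim)


class SpatialStream(nn.Module):
    """MLP over relative-distance features between joint pairs, averaged over time."""
    def __init__(self, in_dim, hidden_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())

    def forward(self, x):                 # x: (batch, frames, in_dim)
        return self.mlp(x).mean(dim=1)    # temporal average pooling


class TwoStreamPoseNet(nn.Module):
    """Fuses the two stream outputs and predicts the action class."""
    def __init__(self, speed_dim, dist_dim, num_classes, hidden_dim=128):
        super().__init__()
        self.temporal = TemporalStream(speed_dim, hidden_dim)
        self.spatial = SpatialStream(dist_dim, hidden_dim)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, speed_feats, dist_feats):
        fused = torch.cat([self.temporal(speed_feats),
                           self.spatial(dist_feats)], dim=1)
        return self.classifier(fused)     # class logits


# Example usage with dummy tensors: 4 clips, 30 frames,
# 15 joints x 2 coords = 30 speed features, C(15, 2) = 105 pairwise distances.
if __name__ == "__main__":
    model = TwoStreamPoseNet(speed_dim=30, dist_dim=105, num_classes=12)
    speed = torch.randn(4, 30, 30)
    dist = torch.randn(4, 30, 105)
    print(model(speed, dist).shape)       # torch.Size([4, 12])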
Pages: 16