Direction-guided two-stream convolutional neural networks for skeleton-based action recognition

Cited by: 0
Authors
Benyue Su
Peng Zhang
Manzhen Sun
Min Sheng
Affiliations
[1] Anqing Normal University,Key Laboratory of Intelligent Perception and Computing of Anhui Province
[2] Anqing Normal University,School of Computer and Information
[3] Tongling University,School of Mathematics and Computer
[4] Anqing Normal University,School of Mathematics and Physics
Source
Soft Computing | 2023, Volume 27
Keywords
Action recognition; Skeleton data; Direction; Edge-level information; Motion information; Feature fusion
Abstract
In skeleton-based action recognition, treating skeleton data as pseudo-images processed by convolutional neural networks (CNNs) has proven effective. However, most existing CNN-based approaches model information at the joint level and ignore the length and direction of skeleton edges, which play an important role in action recognition; such approaches may therefore be suboptimal. Moreover, existing approaches rarely exploit the directionality of human motion to characterize how an action varies over time, although doing so is more natural and better suited to modeling action sequences. In this work, we propose a novel direction-guided two-stream convolutional neural network for skeleton-based action recognition. The first stream focuses on our defined edge-level information (edge and edge_motion information), which carries directionality in the skeleton data, to explore the spatiotemporal features of an action. In the second stream, since motion is directional, we define different skeleton edge directions and extract distinct motion information (translation and rotation) along those directions to better exploit the motion features of an action. In addition, we propose a description of human motion as a combination of translation and rotation, and explore how the two are integrated. We conducted extensive experiments on two challenging datasets, NTU-RGB+D 60 and NTU-RGB+D 120, to verify the superiority of the proposed method over state-of-the-art methods. The experimental results demonstrate that the proposed direction-guided edge-level information and motion information complement each other for better action recognition.
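The abstract does not give formulas, but the edge-level quantities it names can be illustrated with a minimal sketch. Assuming (as is common for skeleton data) a sequence of 3D joint coordinates and a fixed bone list of (parent, child) joint-index pairs, an edge vector is the difference between the two endpoint joints (capturing both bone length and direction), and edge_motion is its frame-to-frame change. The function name and array layout below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def edge_level_features(joints, bones):
    """Illustrative edge / edge_motion features for a skeleton sequence.

    joints: (T, J, 3) array of 3D joint coordinates over T frames.
    bones:  list of (parent, child) joint-index pairs defining the skeleton edges.
    """
    parents = np.array([p for p, _ in bones])
    children = np.array([c for _, c in bones])
    # Edge vectors encode both the length and the direction of each bone.
    edges = joints[:, children] - joints[:, parents]   # shape (T, E, 3)
    # Edge motion: frame-to-frame change of each edge vector.
    edge_motion = edges[1:] - edges[:-1]               # shape (T-1, E, 3)
    return edges, edge_motion
```

In a two-stream setup of the kind the abstract describes, tensors like these would be stacked into pseudo-image channels and fed to a CNN stream; the translation/rotation decomposition of the second stream would additionally project motion along the defined edge directions.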
Pages: 11833–11842
Page count: 9