Spatio-temporal attention on manifold space for 3D human action recognition

Cited by: 0
Authors
Chongyang Ding
Kai Liu
Fei Cheng
Evgeny Belyaev
Affiliations
[1] Xidian University, Department of Computer Science and Technology
[2] ITMO University, Department of Information Systems
Source
Applied Intelligence | 2021 / Vol. 51
Keywords
Skeleton-based; Action recognition; Spatial attention; Temporal attention; Manifold space
DOI
Not available
Abstract
Recently, skeleton-based action recognition has become increasingly prevalent in computer vision due to its wide range of applications, and many approaches have been proposed to address this task. Among these methods, manifold space is widely used to deal with the relative geometric relationships between different body parts in human skeletons. Existing studies treat all geometric relationships as having the same degree of importance; thus, they cannot focus on significant information. In addition, the traditional attention mechanism aims mostly to solve the attention problems in Euclidean space, and is not applicable in manifold space. To investigate these issues, we propose a spatial and temporal attention mechanism on Lie groups for 3D human action recognition. We build our network architecture with a generalized attention mechanism that extends the scope of attention from traditional Euclidean space to manifold space. In addition, our model can learn to identify the significant spatial features and temporal stages with effective attention modules, which focus on discriminative transformation relationships between different rigid bodies within each frame and allocate different levels of attention to different frames. Extensive experiments are conducted on standard datasets and the experimental results demonstrate the effectiveness of the proposed network architecture.
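As a rough illustration of the idea of weighting frames whose features live on a Lie group, the sketch below maps per-frame rotations to the tangent space with the SO(3) logarithm and then applies an ordinary softmax attention over frames. This is not the authors' architecture: the restriction to SO(3) rather than full rigid-body transformations, the helper names (`so3_log`, `rot_z`, `temporal_attention_pool`), and the fixed scoring vector `w` are all assumptions made for the example.

```python
import math

def so3_log(R):
    """Log map of a 3x3 rotation matrix to its axis-angle vector.

    Valid for rotation angles strictly below pi (assumption of this sketch).
    """
    trace = R[0][0] + R[1][1] + R[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    if theta < 1e-8:
        return [0.0, 0.0, 0.0]
    scale = theta / (2.0 * math.sin(theta))
    return [scale * (R[2][1] - R[1][2]),
            scale * (R[0][2] - R[2][0]),
            scale * (R[1][0] - R[0][1])]

def rot_z(angle):
    """Rotation about the z-axis, used here only to build toy frames."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_attention_pool(rotations, w):
    """Score each frame's tangent vector with a hypothetical weight vector w,
    normalize the scores over frames, and return the weighted average."""
    tangents = [so3_log(R) for R in rotations]
    scores = [sum(wi * ti for wi, ti in zip(w, t)) for t in tangents]
    weights = softmax(scores)
    pooled = [sum(a * t[k] for a, t in zip(weights, tangents)) for k in range(3)]
    return weights, pooled

# Three toy frames: one relative rotation between two body parts per frame.
frames = [rot_z(0.1), rot_z(0.5), rot_z(1.0)]
weights, pooled = temporal_attention_pool(frames, w=[0.0, 0.0, 2.0])
```

In this toy setup the frame with the largest rotation receives the largest weight. In a trained model the scoring vector would be learned rather than fixed, and spatial attention over body-part pairs within a frame could be sketched the same way.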
Pages: 560-570
Number of pages: 10