Spatio-temporal attention on manifold space for 3D human action recognition

Cited by: 22
Authors
Ding, Chongyang [1]
Liu, Kai [1]
Cheng, Fei [1]
Belyaev, Evgeny [2]
Affiliations
[1] Xidian Univ, Dept Comp Sci & Technol, Xian, Peoples R China
[2] ITMO Univ, Dept Informat Syst, St Petersburg, Russia
Funding
National Natural Science Foundation of China
Keywords
Skeleton-based; Action recognition; Spatial attention; Temporal attention; Manifold space; NETWORK; LSTM;
DOI
10.1007/s10489-020-01803-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, skeleton-based action recognition has become increasingly prevalent in computer vision due to its wide range of applications, and many approaches have been proposed to address this task. Among these methods, manifold space is widely used to deal with the relative geometric relationships between different body parts in human skeletons. Existing studies treat all geometric relationships as having the same degree of importance; thus, they cannot focus on significant information. In addition, the traditional attention mechanism aims mostly to solve the attention problems in Euclidean space, and is not applicable in manifold space. To investigate these issues, we propose a spatial and temporal attention mechanism on Lie groups for 3D human action recognition. We build our network architecture with a generalized attention mechanism that extends the scope of attention from traditional Euclidean space to manifold space. In addition, our model can learn to identify the significant spatial features and temporal stages with effective attention modules, which focus on discriminative transformation relationships between different rigid bodies within each frame and allocate different levels of attention to different frames. Extensive experiments are conducted on standard datasets and the experimental results demonstrate the effectiveness of the proposed network architecture.
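The abstract does not give the network's equations, so the following is only a minimal sketch of the general idea it describes, not the authors' architecture. It assumes that relative rotations between body parts are elements of SO(3), that they are mapped to the tangent space with the logarithm map, and that a plain softmax scoring (with hypothetical weight vectors `w_spatial` and `w_temporal`) stands in for the learned attention modules. All function and variable names are illustrative.

```python
import numpy as np

def so3_log(R):
    """Log map from a rotation matrix in SO(3) to an axis-angle vector in R^3.

    Sketch only: not numerically robust for rotation angles close to pi.
    """
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-8:
        return np.zeros(3)
    # Vector form of the skew-symmetric part R - R^T.
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta / (2.0 * np.sin(theta)) * w

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(features, w):
    """Softmax attention over the rows of `features` (N, D).

    `w` (D,) is a hypothetical scoring vector; a trained model would learn it.
    Returns the attention weights (N,) and the weighted sum of rows (D,).
    """
    alpha = softmax(features @ w)
    return alpha, alpha @ features

def rot_z(angle):
    """Rotation about the z-axis, used here only to build toy input."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Toy sequence: T frames, P body-part pairs, one relative rotation per pair.
T, P = 4, 3
rotations = np.array(
    [[rot_z(0.1 * (t + 1) * (p + 1)) for p in range(P)] for t in range(T)]
)

# Map every rotation to the tangent space, giving an array of shape (T, P, 3).
tangent = np.array([[so3_log(R) for R in frame] for frame in rotations])

rng = np.random.default_rng(0)
w_spatial = rng.normal(size=3)   # assumed, untrained
w_temporal = rng.normal(size=3)  # assumed, untrained

# Spatial attention: weight the body-part pairs within each frame.
frame_feats = np.array([attend(frame, w_spatial)[1] for frame in tangent])

# Temporal attention: weight the frames of the sequence.
alpha_t, clip_feat = attend(frame_feats, w_temporal)
```

Here `alpha_t` holds one weight per frame and `clip_feat` is a single 3-dimensional summary of the sequence. A real model would use full rigid-body transformations and trained scoring functions rather than random vectors.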
Pages: 560-570
Page count: 11