Improved SSD using deep multi-scale attention spatial–temporal features for action recognition

Cited by: 0
Authors
Shuren Zhou
Jia Qiu
Arun Solanki
Affiliations
[1] Changsha University of Science and Technology, School of Computer and Communication Engineering
[2] Gautam Buddha University, School of Information and Communication Technology
Source
Multimedia Systems | 2022 / Vol. 28
Keywords
Action recognition; Multi-scale spatial–temporal feature; Attention mechanism
DOI
Not available
Abstract
The key difference between video-based and image-based action recognition is that video carries an additional temporal dimension. Most deep-learning methods for action recognition either (1) model temporal features with 3D convolution, or (2) introduce an auxiliary temporal feature such as optical flow. However, 3D convolutional networks usually consume substantial computational resources, while optical flow extraction requires an additional, tedious preprocessing step and extra storage, and usually captures only short-range temporal features. To model temporal features better, we propose a multi-scale attention spatial–temporal feature network based on SSD. The video is divided into segments spanning the whole sequence and sampled sparsely; a self-attention mechanism then captures the relation between each frame and the sequence of frames sampled across the entire video, so that the network attends to the representative frames in the sequence. In addition, an attention mechanism assigns different weights to the inter-frame relations at different time scales, so as to reason about the contextual relations of actions in the time dimension. The proposed method achieves competitive performance on two commonly used datasets, UCF101 and HMDB51.
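The two ideas named in the abstract, segment-based sparse sampling over the whole video and self-attention across the sampled frames, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sparse_sample_indices` and `self_attention`, the choice of the middle frame of each segment, the absence of learned projection matrices, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def sparse_sample_indices(num_frames, num_segments):
    """Split [0, num_frames) into equal segments and pick each segment's middle frame.

    Picking the middle frame is an assumption; random selection within
    each segment is another common choice.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    seg_len = num_frames / num_segments
    return [min(num_frames - 1, int(seg_len * (k + 0.5)))
            for k in range(num_segments)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention over per-frame feature vectors.

    Queries, keys and values are the raw features themselves (no learned
    projections), which is a simplifying assumption for this sketch.
    Returns the attended features and the frame-to-frame weight matrix.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    weights = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights.append(softmax(scores))
    attended = [[sum(w * value[d] for w, value in zip(row, features))
                 for d in range(dim)]
                for row in weights]
    return attended, weights


# A 100-frame video sampled with 4 segments.
indices = sparse_sample_indices(100, 4)  # [12, 37, 62, 87]

# Hypothetical 2-D features for the 4 sampled frames.
frame_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
attended, weights = self_attention(frame_features)
```

Each row of `weights` sums to 1 and indicates how strongly one sampled frame attends to every other sampled frame, which is the sense in which the network can emphasize representative frames.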
Pages: 2123-2131
Page count: 8