Sparse Coding on Local Spatial-Temporal Volumes for Human Action Recognition

Cited by: 0
Authors
Zhu, Yan [1 ]
Zhao, Xu [1 ]
Fu, Yun [2 ]
Liu, Yuncai [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai 200240, Peoples R China
[2] SUNY Buffalo, Dept CSE, Buffalo, NY 14260 USA
DOI: not available
CLC Classification: TP18 [Artificial Intelligence Theory]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
By extracting local spatial-temporal features from videos, many recently proposed approaches to action recognition achieve promising performance. The Bag-of-Words (BoW) model is commonly used in these approaches to obtain video-level representations. However, the BoW model coarsely assigns each feature vector to its closest visual word, inevitably causing nontrivial quantization errors and impairing further improvement of classification rates. To obtain a more accurate and discriminative representation, in this paper we propose an approach to action recognition that encodes local 3D spatial-temporal gradient features within the sparse coding framework. In doing so, each local spatial-temporal feature is transformed into a linear combination of a few "atoms" from a trained dictionary. In addition, we investigate the construction of the dictionary under the guidance of transfer learning. We collect a large set of diverse video clips from sports games and movies, from which a set of universal atoms composing the dictionary is learned by an online learning strategy. We test our approach on the KTH and UCF Sports datasets. Experimental results demonstrate that our approach outperforms state-of-the-art techniques on the KTH dataset and achieves comparable performance on the UCF Sports dataset.
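The core sparse coding step described in the abstract — representing each local spatial-temporal feature as a linear combination of a few dictionary atoms — can be sketched with a greedy Orthogonal Matching Pursuit solver. This is a minimal, hypothetical illustration on synthetic data, not the authors' implementation (their dictionary is learned online from sport and movie clips; here the dictionary and feature vector are random stand-ins):

```python
import numpy as np

def omp_sparse_code(D, x, k):
    """Approximate x as a combination of at most k atoms (columns) of D.

    D: (d, n_atoms) dictionary with unit-norm columns.
    x: (d,) local spatial-temporal feature vector.
    Returns a sparse coefficient vector of length n_atoms.
    """
    residual = x.astype(float).copy()
    support = []                      # indices of selected atoms
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        # Greedily pick the atom most correlated with the residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Re-fit coefficients over the current support by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = x - D @ coef
    return coef

# Toy usage: an overcomplete random dictionary and a 2-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 64))
D /= np.linalg.norm(D, axis=0)        # normalize atoms to unit norm
x = 2.0 * D[:, 3] - 1.5 * D[:, 40]    # feature built from two atoms
code = omp_sparse_code(D, x, k=2)     # at most 2 nonzero coefficients
```

Replacing the hard BoW assignment with such a sparse code keeps more information per feature: instead of one winning visual word, each feature contributes a few weighted atoms, which is what reduces the quantization error the abstract refers to.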
Pages: 660 / +
Page count: 3
Related Papers (50 records)
  • [41] Action Recognition Using a Spatial-Temporal Network for Wild Felines
    Feng, Liqi
    Zhao, Yaqin
    Sun, Yichao
    Zhao, Wenxuan
    Tang, Jiaxi
    ANIMALS, 2021, 11 (02): : 1 - 18
  • [42] A SPATIAL-TEMPORAL CONSTRAINT-BASED ACTION RECOGNITION METHOD
    Han, Tingting
    Yao, Hongxun
    Zhang, Yanhao
    Xu, Pengfei
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 2767 - 2771
  • [43] A Local Spatial-Temporal Synchronous Network to Dynamic Gesture Recognition
    Zhao, Dongdong
    Yang, Qinglian
    Zhou, Xingwen
    Li, Hongli
    Yan, Shi
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2023, 10 (05) : 2226 - 2233
  • [44] Robust Action Recognition Using Multi-Scale Spatial-Temporal Concatenations of Local Features as Natural Action Structures
    Zhu, Xiaoyuan
    Li, Meng
    Li, Xiaojian
    Yang, Zhiyong
    Tsien, Joe Z.
    PLOS ONE, 2012, 7 (10):
  • [45] Multiple Distilling-based spatial-temporal attention networks for unsupervised human action recognition
    Zhang, Cheng
    Zhong, Jianqi
    Cao, Wenming
    Ji, Jianhua
    INTELLIGENT DATA ANALYSIS, 2024, 28 (04) : 921 - 941
  • [46] Human action recognition via multi-task learning base on spatial-temporal feature
    Guo, Wenzhong
    Chen, Guolong
    INFORMATION SCIENCES, 2015, 320 : 418 - 428
  • [47] Human Action Recognition for Dynamic Scenes of Emergency Rescue Based on Spatial-Temporal Fusion Network
    Zhang, Yongmei
    Guo, Qian
    Du, Zhirong
    Wu, Aiyan
    ELECTRONICS, 2023, 12 (03)
  • [48] Human Action Recognition by Decision-Making Level Fusion Based on Spatial-Temporal Features
    Li Yandi
    Xu Xiping
    ACTA OPTICA SINICA, 2018, 38 (08)
  • [49] 3D Spatial-Temporal View based Motion Tracing in Human Action Recognition
    Silambarasi, R.
    Sahoo, Suraj Prakash
    Ari, Samit
    2017 INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2017, : 1833 - 1837
  • [50] Human action recognition based on spatial-temporal relational model and LSTM-CNN framework
    Senthilkumar, N.
    Manimegalai, M.
    Karpakam, S.
    Ashokkumar, S. R.
    Premkumar, M.
    MATERIALS TODAY-PROCEEDINGS, 2022, 57 : 2087 - 2091