Sparse Coding on Local Spatial-Temporal Volumes for Human Action Recognition

被引:0
|
作者
Zhu, Yan [1 ]
Zhao, Xu [1 ]
Fu, Yun [2 ]
Liu, Yuncai [1 ]
机构
[1] Shanghai Jiao Tong Univ, Shanghai 200240, Peoples R China
[2] SUNY Buffalo, Dept CSE, Buffalo, NY 14260 USA
来源
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
By extracting local spatial-temporal features from videos, many recently proposed approaches for action recognition achieve promising performance. The Bag-of-Words (BoW) model is commonly used in the approaches to obtain the video level representations. However, BoW model roughly assigns each feature vector to its closest visual word, therefore inevitably causing nontrivial quantization errors and impairing further improvements on classification rates. To obtain a more accurate and discriminative representation, in this paper, we propose an approach for action recognition by encoding local 3D spatial-temporal gradient features within the sparse coding framework. In so doing, each local spatial-temporal feature is transformed to a linear combination of a few "atoms" in a trained dictionary. In addition, we also investigate the construction of the dictionary under the guidance of transfer learning. We collect a large set of diverse video clips of sport games and movies, from which a set of universal atoms composed of the dictionary are learned by an online learning strategy. We test our approach on KTH dataset and UCF sports dataset. Experimental results demonstrate that our approach outperforms the state-of-art techniques on KTH dataset and achieves the comparable performance on UCF sports dataset.
引用
收藏
页码:660 / +
页数:3
相关论文
共 50 条
  • [21] Human action recognition based on spatial-temporal descriptors using key poses
    Hu, Shuo
    Chen, Yuxin
    Wang, Huaibao
    Zuo, Yaqing
    INTERNATIONAL SYMPOSIUM ON OPTOELECTRONIC TECHNOLOGY AND APPLICATION 2014: IMAGE PROCESSING AND PATTERN RECOGNITION, 2014, 9301
  • [22] Human Action Recognition by Fusion of Convolutional Neural Networks and spatial-temporal Information
    Li, Weisheng
    Ding, Yahui
    8TH INTERNATIONAL CONFERENCE ON INTERNET MULTIMEDIA COMPUTING AND SERVICE (ICIMCS2016), 2016, : 255 - 259
  • [23] Action Recognition by Joint Spatial-Temporal Motion Feature
    Zhang, Weihua
    Zhang, Yi
    Gao, Chaobang
    Zhou, Jiliu
    JOURNAL OF APPLIED MATHEMATICS, 2013,
  • [24] Spatial-Temporal Separable Attention for Video Action Recognition
    Guo, Xi
    Hu, Yikun
    Chen, Fang
    Jin, Yuhui
    Qiao, Jian
    Huang, Jian
    Yang, Qin
    2022 INTERNATIONAL CONFERENCE ON FRONTIERS OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING, FAIML, 2022, : 224 - 228
  • [25] Spatial-Temporal Pyramid Graph Reasoning for Action Recognition
    Geng, Tiantian
    Zheng, Feng
    Hou, Xiaorong
    Lu, Ke
    Qi, Guo-Jun
    Shao, Ling
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 5484 - 5497
  • [26] Spatial-Temporal Convolutional Attention Network for Action Recognition
    Luo, Huilan
    Chen, Han
    Computer Engineering and Applications, 2023, 59 (09): : 150 - 158
  • [27] Action recognition with spatial-temporal discriminative filter banks
    Martinez, Brais
    Modolo, Davide
    Xiong, Yuanjun
    Tighe, Joseph
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 5481 - 5490
  • [28] Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
    Luo, Chenxu
    Yuille, Alan
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 5511 - 5520
  • [29] Select and Focus: Action Recognition with Spatial-Temporal Attention
    Chan, Wensong
    Tian, Zhiqiang
    Liu, Shuai
    Ren, Jing
    Lan, Xuguang
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PT III, 2019, 11742 : 461 - 471
  • [30] Spatial-Temporal Interleaved Network for Efficient Action Recognition
    Jiang, Shengqin
    Zhang, Haokui
    Qi, Yuankai
    Liu, Qingshan
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2025, 21 (01) : 178 - 187