Detecting Human Action as the Spatio-Temporal Tube of Maximum Mutual Information

被引:18
|
作者
Wang, Taiqing [1 ,2 ]
Wang, Shengjin [1 ,2 ]
Ding, Xiaoqing [1 ,2 ]
机构
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
基金
中国国家自然科学基金; 国家高技术研究发展计划(863计划);
关键词
Action detection; feature trajectory; mutual information; spatio-temporal cuboid (ST-cuboid); spatio-temporal tube (ST-tube); RECOGNITION; MOTION; DENSE;
D O I
10.1109/TCSVT.2013.2276856
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
Human action detection in complex scenes is a challenging problem due to its high-dimensional search space and dynamic backgrounds. To achieve efficient and accurate action detection, we represent a video sequence as a collection of feature trajectories and model human action as the spatio-temporal tube (ST-tube) of maximum mutual information. First, a random forest is built to evaluate the mutual information of feature trajectories toward the action class, and then a one-order Markov model is introduced to recursively infer the action regions at consecutive frames. By exploring the time-continuity property of feature trajectories, the action region is efficiently inferred at large temporal intervals. Finally, we obtain an ST-tube by concatenating the consecutive action regions bounding the human bodies. Compared with the popular spatio-temporal cuboid action model, the proposed ST-tube model is not only more efficient, but also more accurate in action localization. Experimental results on the KTH, CMU and UCF sports datasets validate the superiority of our approach over the state-of-the-art methods in both localization accuracy and time efficiency.
引用
收藏
页码:277 / 290
页数:14
相关论文
共 50 条
  • [1] Spatio-temporal information for human action recognition
    Yao, Li
    Liu, Yunjian
    Huang, Shihui
    EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2016,
  • [2] Spatio-temporal information for human action recognition
    Li Yao
    Yunjian Liu
    Shihui Huang
    EURASIP Journal on Image and Video Processing, 2016
  • [3] Spatio-Temporal Information Fusion and Filtration for Human Action Recognition
    Zhang, Man
    Li, Xing
    Wu, Qianhan
    SYMMETRY-BASEL, 2023, 15 (12):
  • [4] Mutually Reinforced Spatio-Temporal Convolutional Tube for Human Action Recognition
    Wu, Haoze
    Liu, Jiawei
    Zha, Zheng-Jun
    Chen, Zhenzhong
    Sun, Xiaoyan
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 968 - 974
  • [5] Silhouette analysis for human action recognition based on maximum spatio-temporal dissimilarity embedding
    Jian Cheng
    Haijun Liu
    Hongsheng Li
    Machine Vision and Applications, 2014, 25 : 1007 - 1018
  • [6] Silhouette analysis for human action recognition based on maximum spatio-temporal dissimilarity embedding
    Cheng, Jian
    Liu, Haijun
    Li, Hongsheng
    MACHINE VISION AND APPLICATIONS, 2014, 25 (04) : 1007 - 1018
  • [7] Spatio-Temporal Steerable Pyramid for Human Action Recognition
    Zhen, Xiantong
    Shao, Ling
    2013 10TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), 2013,
  • [8] Spatio-temporal Video Autoencoder for Human Action Recognition
    Sousa e Santos, Anderson Carlos
    Pedrini, Helio
    PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2019, : 114 - 123
  • [9] Spatio-temporal human action localization in indoor surveillances
    Liu, Zihao
    Yan, Danfeng
    Cai, Yuanqiang
    Song, Yan
    Pattern Recognition, 2024, 147
  • [10] Spatio-temporal human action localization in indoor surveillances
    Liu, Zihao
    Yan, Danfeng
    Cai, Yuanqiang
    Song, Yan
    PATTERN RECOGNITION, 2024, 147