Action Recognition Using Multiple Pooling Strategies of CNN Features

Cited by: 0
Authors
Haifeng Hu
Zhongke Liao
Xiang Xiao
Affiliations
[1] Sun Yat-sen University, School of Electronic and Information Engineering
Source
Neural Processing Letters | 2019 / Vol. 50
Keywords
Action recognition; Convolutional neural networks; Multiple pooling strategies
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Deep convolutional neural networks have shown great potential in human action recognition. To obtain a compact and discriminative feature representation, this paper proposes multiple pooling strategies for CNN features. We explore three pooling strategies: space-time feature pooling (STFP), time filter pooling (TFP), and spatio-temporal pyramid pooling (STPP). STFP shares the advantages of both hand-crafted features and deep ConvNet features. TFP reflects how the elements of each CNN feature map change over time. STPP focuses on the spatial and temporal pyramid structure of the feature maps. We aggregate these pooled features to produce a new discriminative video descriptor. Experimental results show that the three strategies have complementary advantages on the challenging YouTube, UCF50 and UCF101 datasets, and that our video representation is comparable to previous state-of-the-art algorithms.
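The abstract does not give the formulation of any of the three strategies, so the following is only a generic sketch of the pyramid idea that STPP refers to, not the authors' method. It assumes a video is represented as a stack of CNN feature maps of shape (T, H, W, C), splits that stack into coarse-to-fine space-time cells, max-pools each cell per channel, and concatenates the results. The function name `pyramid_max_pool`, the bin settings, and the choice of max pooling are all illustrative assumptions.

```python
import numpy as np


def pyramid_max_pool(features, temporal_bins=(1, 2), spatial_bins=(1, 2)):
    """Generic space-time pyramid max pooling (illustrative, not the paper's STPP).

    features: array of shape (T, H, W, C), one CNN feature map per frame.
    Level i splits time into temporal_bins[i] segments and each spatial axis
    into spatial_bins[i] segments, then takes the per-channel max in each cell.
    Assumes T, H and W are at least as large as the largest bin count.
    """
    pooled = []
    for t_bins, s_bins in zip(temporal_bins, spatial_bins):
        for t_chunk in np.array_split(features, t_bins, axis=0):
            for h_chunk in np.array_split(t_chunk, s_bins, axis=1):
                for cell in np.array_split(h_chunk, s_bins, axis=2):
                    # Collapse time, height and width; keep one value per channel.
                    pooled.append(cell.max(axis=(0, 1, 2)))
    return np.concatenate(pooled)


# Hypothetical input: 8 frames of 7x7 feature maps with 4 channels.
rng = np.random.default_rng(0)
feats = rng.random((8, 7, 7, 4))
descriptor = pyramid_max_pool(feats)
print(descriptor.shape)
```

With these assumed settings the pyramid has 1 cell at the coarse level and 2 x 2 x 2 = 8 cells at the fine level, so the descriptor length is 9 times the channel count regardless of the video length.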
Pages: 379 - 396
Number of pages: 17
Related papers
50 in total
  • [21] Action recognition using polyhedron neighborhood features
    Yang, Jiangfeng
    Ma, Zheng
    International Journal of Multimedia and Ubiquitous Engineering, 2015, 10 (01): : 391 - 402
  • [22] Multilayer deep features with multiple kernel learning for action recognition
    Sheng, Biyun
    Li, Jun
    Xiao, Fu
    Yang, Wankou
    NEUROCOMPUTING, 2020, 399 : 65 - 74
  • [23] Fusing Multiple Features for Depth-Based Action Recognition
    Zhu, Yu
    Chen, Wenbin
    Guo, Guodong
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2015, 6 (02)
  • [24] LEARNING GEOMETRIC FEATURES WITH DUAL-STREAM CNN FOR 3D ACTION RECOGNITION
    Thien Huynh-The
    Hua, Cam-Hao
    Nguyen Anh Tu
    Kim, Dong-Seong
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 2353 - 2357
  • [25] FACE RECOGNITION BY LANDMARK POOLING-BASED CNN WITH CONCENTRATE LOSS
    Huang, Rui
    Xie, Xiaohua
    Feng, Zhanxiang
    Lai, Jianhuang
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 1582 - 1586
  • [26] Features for Action Recognition
    Le T.
    Duc N.H.
    Nguyen C.T.
    Tran M.T.
    Informatica (Slovenia), 2023, 47 (03): : 327 - 334
  • [27] Aggressive action recognition using 3D CNN architectures
    Saveliev, Anton
    Uzdiaev, Mikhail
    Dmitrii, Malov
    12TH INTERNATIONAL CONFERENCE ON THE DEVELOPMENTS IN ESYSTEMS ENGINEERING (DESE 2019), 2019, : 890 - 895
  • [28] Spatial-temporal pooling for action recognition in videos
    Wang, Jiaming
    Shao, Zhenfeng
    Huang, Xiao
    Lu, Tao
    Zhang, Ruiqian
    Lv, Xianwei
    NEUROCOMPUTING, 2021, 451 : 265 - 278
  • [29] Second-order Temporal Pooling for Action Recognition
    Cherian, Anoop
    Gould, Stephen
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2019, 127 (04) : 340 - 362
  • [30] Action Recognition Using Mined Hierarchical Compound Features
    Gilbert, Andrew
    Illingworth, John
    Bowden, Richard
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2011, 33 (05) : 883 - 897