Action Recognition Using Multiple Pooling Strategies of CNN Features

Authors
Haifeng Hu
Zhongke Liao
Xiang Xiao
Affiliation
[1] Sun Yat-sen University, School of Electronic and Information Engineering
Source
Neural Processing Letters | 2019 / Vol. 50
Keywords
Action recognition; Convolutional neural networks; Multiple pooling strategies
DOI
Not available
Abstract
Deep convolutional neural networks have shown great potential for human action recognition. To obtain a compact and discriminative feature representation, this paper proposes multiple pooling strategies for CNN features. We explore three pooling strategies: space-time feature pooling (STFP), time filter pooling (TFP), and spatio-temporal pyramid pooling (STPP). STFP shares the advantages of both hand-crafted features and deep ConvNet features. TFP captures how the elements of each CNN feature map change over time. STPP focuses on the spatial and temporal pyramid structure of the feature maps. We aggregate these pooled features to produce a new discriminative video descriptor. Experimental results on the challenging YouTube, UCF50 and UCF101 datasets show that the three strategies are complementary and that our video representation is comparable to previous state-of-the-art algorithms.
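As a rough illustration of the pyramid idea mentioned in the abstract, the NumPy sketch below max-pools a stack of per-frame feature maps over a coarse-to-fine grid of space-time cells and concatenates the results into one descriptor. The abstract gives no implementation details, so the function name `spatio_temporal_pyramid_pool`, the `(T, H, W, C)` layout, the use of max pooling, and the pyramid levels are all assumptions made for illustration; this is not the authors' STPP.

```python
import numpy as np


def spatio_temporal_pyramid_pool(features, levels=(1, 2)):
    """Max-pool a (T, H, W, C) feature volume over a space-time pyramid.

    Hypothetical sketch: at level n the volume is split into n x n x n
    cells, each cell is max-pooled per channel, and all cell vectors are
    concatenated. Each of T, H and W must be at least max(levels).
    """
    t, h, w, _ = features.shape
    pooled = []
    for n in levels:
        # np.array_split also handles sizes that are not divisible by n.
        for t_idx in np.array_split(np.arange(t), n):
            for h_idx in np.array_split(np.arange(h), n):
                for w_idx in np.array_split(np.arange(w), n):
                    cell = features[np.ix_(t_idx, h_idx, w_idx)]
                    pooled.append(cell.max(axis=(0, 1, 2)))
    return np.concatenate(pooled)


# Toy input: 8 frames of 7x7 feature maps with 4 channels (random values).
rng = np.random.default_rng(0)
features = rng.random((8, 7, 7, 4))
descriptor = spatio_temporal_pyramid_pool(features)
```

With levels `(1, 2)` the descriptor has (1 + 8) cells times 4 channels, so its length is fixed by the pyramid and channel count and does not depend on the number of frames.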
Pages: 379-396
Page count: 17