Action Recognition Using Multiple Pooling Strategies of CNN Features

Cited by: 0

Authors:

Haifeng Hu

Zhongke Liao

Xiang Xiao

Affiliation:

[1] Sun Yat-sen University, School of Electronic and Information Engineering

Source:

Neural Processing Letters | 2019 / Vol. 50

Keywords:

Action recognition; Convolutional neural networks; Multiple pooling strategies

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Deep convolutional neural networks have shown great potential in human action recognition. To obtain a compact and discriminative feature representation, this paper proposes multiple pooling strategies for CNN features. We explore three pooling strategies: space-time feature pooling (STFP), time filter pooling (TFP), and spatio-temporal pyramid pooling (STPP). STFP shares the advantages of both hand-crafted features and deep ConvNet features. TFP reflects how the elements of each CNN feature map change over time. STPP focuses on the spatial and temporal pyramid structure of the feature maps. We aggregate these pooled features to produce a new discriminative video descriptor. Experimental results on the challenging YouTube, UCF50 and UCF101 datasets show that the three strategies have complementary advantages, and that our video representation is comparable to previous state-of-the-art algorithms.

Pages: 379-396

Page count: 17
