End-to-end Video-level Representation Learning for Action Recognition

Cited by: 0
Authors
Zhu, Jiagang [1 ,2 ]
Zhu, Zheng [1 ,2 ]
Zou, Wei [1 ,3 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] CASIA Co Ltd, TianJin Intelligent Tech Inst, Beijing, Peoples R China
Source
2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2018
Funding
National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program);
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
From frame/clip-level feature learning to video-level representation building, deep learning methods for action recognition have developed rapidly in recent years. However, current methods suffer from the confusion caused by training on partial observations, lack end-to-end learning, or are restricted to modeling a single temporal scale. In this paper, we build upon two-stream ConvNets and propose Deep networks with Temporal Pyramid Pooling (DTPP), an end-to-end video-level representation learning approach, to address these problems. Specifically, RGB images and optical flow stacks are first sparsely sampled across the whole video. A temporal pyramid pooling layer then aggregates the frame-level features, which carry both spatial and temporal cues. The trained model thus produces a compact video-level representation over multiple temporal scales that is both global and sequence-aware. Experimental results show that DTPP achieves state-of-the-art performance on two challenging video action datasets, UCF101 and HMDB51, with either ImageNet pre-training or Kinetics pre-training.
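The aggregation step described in the abstract can be illustrated with a minimal sketch. The pyramid levels (1, 2, 4), the choice of max pooling, and the function name temporal_pyramid_pool below are illustrative assumptions, not settings taken from the paper; frame-level feature extraction by the two-stream ConvNets is omitted.

```python
import torch

def temporal_pyramid_pool(frame_feats: torch.Tensor,
                          levels=(1, 2, 4)) -> torch.Tensor:
    """Pool frame-level features of shape (T, D) into a fixed-length
    video-level vector by max-pooling over segments at several
    temporal scales.

    Assumption: the pyramid levels and max pooling are illustrative
    choices, not the paper's exact configuration.
    """
    T, _ = frame_feats.shape
    pooled = []
    for k in levels:
        # Split the T sampled frames into k roughly equal temporal segments.
        bounds = torch.linspace(0, T, k + 1).long()
        for i in range(k):
            lo, hi = bounds[i].item(), bounds[i + 1].item()
            segment = frame_feats[lo:max(lo + 1, hi)]  # at least one frame
            pooled.append(segment.max(dim=0).values)
    # Concatenating all segment descriptors gives a D * sum(levels) vector:
    # level 1 is a global summary, finer levels keep temporal order,
    # so the result is both global and sequence-aware.
    return torch.cat(pooled, dim=0)

# Example: 25 sparsely sampled frames with 1024-d features yield a
# 1024 * (1 + 2 + 4) = 7168-d video-level representation.
video_vec = temporal_pyramid_pool(torch.randn(25, 1024))
assert video_vec.shape == (1024 * 7,)
```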
Pages: 645-650
Page count: 6