Evaluating Two-Stream CNN for Video Classification

Cited by: 75
Authors
Ye, Hao [1]
Wu, Zuxuan [1]
Zhao, Rui-Wei [1]
Wang, Xi [1]
Jiang, Yu-Gang [1]
Xue, Xiangyang [1]
Affiliation
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
Source
ICMR'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL | 2015
Keywords
Video Classification; Deep Learning; CNN; Evaluation;
DOI
10.1145/2671188.2749406
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Videos contain very rich semantic information. Traditional hand-crafted features are known to be inadequate in analyzing complex video semantics. Inspired by the huge success of deep learning methods in analyzing image, audio and text data, significant efforts have recently been devoted to the design of deep nets for video analytics. Among the many practical needs, classifying videos (or video clips) based on their major semantic categories (e.g., "skiing") is useful in many applications. In this paper, we conduct an in-depth study to investigate important implementation options that may affect the performance of deep nets on video classification. Our evaluations are conducted on top of a recent two-stream convolutional neural network (CNN) pipeline, which uses both static frames and motion optical flows, and has demonstrated competitive performance against the state-of-the-art methods. In order to gain insights and to arrive at a practical guideline, many important options are studied, including network architectures, model fusion, learning parameters and the final prediction methods. Based on the evaluations, very competitive results are attained on two popular video classification benchmarks. We hope that the discussions and conclusions from this work can help researchers in related fields to quickly set up a good basis for further investigations along this very promising direction.
Pages: 435-442
Page count: 8