Jointly Learning Deep Features, Deformable Parts, Occlusion and Classification for Pedestrian Detection

Cited by: 110
Authors
Ouyang, Wanli [1, 2]
Zhou, Hui [2]
Li, Hongsheng [2]
Li, Quanquan [2]
Yan, Junjie [3]
Wang, Xiaogang [2]
Affiliations
[1] Univ Sydney, Sch Elect & Informat Engn, Camperdown, NSW 2006, Australia
[2] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[3] SenseTime Grp Ltd, Shatin, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
CNN; convolutional neural networks; object detection; deep learning; deep model; SINGLE; IMAGE; MODEL;
DOI
10.1109/TPAMI.2017.2738645
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Feature extraction, deformation handling, occlusion handling, and classification are four important components in pedestrian detection. Existing methods learn or design these components either individually or sequentially, and the interaction among them has not yet been well explored. This paper proposes that they should be learned jointly so that each component can reinforce the others. We formulate the four components as a joint deep learning framework and propose a new deep network architecture (code available at www.ee.cuhk.edu.hk/wlouyang/projects/ouyangWiccv13Joint/index.html). By establishing automatic, mutual interaction among the components, the deep model achieves an average miss rate of 8.57 percent/11.71 percent on the Caltech benchmark dataset with the new/original annotations.
Pages: 1874-1887
Page count: 14