Joint Training of Cascaded CNN for Face Detection

Cited by: 54
Authors
Qin, Hongwei [1, 2]
Yan, Junjie [3]
Li, Xiu [1, 2]
Hu, Xiaolin [3]
Affiliations
[1] Tsinghua Univ, Grad Sch Shenzhen, Beijing, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
Source
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2016
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR.2016.376
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cascades are widely used in face detection: classifiers with low computational cost are applied first to reject most of the background while preserving recall. Cascades in detection were popularized by the seminal Viola-Jones framework and have since been widely adopted in other pipelines, such as DPM and CNN. However, to the best of our knowledge, most previous detection methods build the cascade in a greedy manner, in which earlier stages are fixed while a new stage is trained, so the optimizations of the different CNNs are isolated from one another. In this paper, we propose joint training to achieve end-to-end optimization of a CNN cascade. We show that the back-propagation algorithm used to train CNNs can be naturally applied to training a CNN cascade. We present how joint training can be conducted on a naive CNN cascade and on the more sophisticated region proposal network (RPN) and Fast R-CNN. Experiments on face detection benchmarks verify the advantages of joint training.
Pages: 3456-3465
Page count: 10