PBG-NET: OBJECT DETECTION WITH A MULTI-FEATURE AND ITERATIVE CNN MODEL

Cited by: 0
Authors
Lou, Yingxin [1]
Fu, Guangtao [2]
Jiang, Zhuqing [1]
Men, Aidong [1]
Zhou, Yun [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[2] Acad Broadcasting Sci, Beijing, Peoples R China
Source
2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW) | 2017
Funding
National Science Foundation (USA);
Keywords
Object detection; convolutional neural network; predicting boxes generation; multi-feature; iterative localization;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We introduce PBG-Net, an object detection system based on a carefully designed multi-feature deep CNN that works without proposal algorithms. First, PBG-Net aggregates hierarchical features into multi-feature maps and discretizes the output of the Conv5 feature map into a set of predicting boxes, a step called Predicting Boxes Generation (PBG). Then, PBG-Net crops the multi-feature maps by mapping the predicting boxes onto them and concatenates the cropped features. Finally, we exploit an iterative regression localization model based on a novel overlap loss function and online hard boxes selection. With around 100 boxes and end-to-end joint training, PBG-Net achieves 74.2% and 71.1% mAP on PASCAL VOC 2007 and PASCAL VOC 2012, respectively, at 12 fps on an NVIDIA GTX 1070p GPU, better than its Faster R-CNN counterparts.
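The abstract does not give the box-generation rule or the form of the overlap loss, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the predicting boxes form a uniform grid over the image (a 10 x 10 grid gives the "around 100 boxes" mentioned above) and that overlap is measured by intersection over union (IoU), with a loss of 1 - IoU. The names `grid_boxes`, `iou`, and `overlap_loss` are hypothetical.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates, with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def grid_boxes(width: float, height: float, rows: int, cols: int) -> List[Box]:
    """Tile an image with rows * cols equal boxes (assumed discretization)."""
    cell_w = width / cols
    cell_h = height / rows
    return [
        (c * cell_w, r * cell_h, (c + 1) * cell_w, (r + 1) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overlap_loss(pred: Box, target: Box) -> float:
    """Assumed loss: 1 - IoU, so perfect overlap costs 0 and no overlap costs 1."""
    return 1.0 - iou(pred, target)


if __name__ == "__main__":
    boxes = grid_boxes(500.0, 375.0, rows=10, cols=10)
    ground_truth = (100.0, 75.0, 200.0, 150.0)
    # Pick the initial box that overlaps the ground truth the most.
    best = min(boxes, key=lambda b: overlap_loss(b, ground_truth))
    print(len(boxes), best, overlap_loss(best, ground_truth))
```

In an iterative localization scheme, a regressor would refine each box over several rounds; a loss of this kind rewards a refinement directly for increasing overlap with the ground truth rather than for matching coordinates.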
Pages: 6