PBG-NET: OBJECT DETECTION WITH A MULTI-FEATURE AND ITERATIVE CNN MODEL

Cited by: 0

Authors:

Lou, Yingxin ^{[1]}

Fu, Guangtao ^{[2]}

Jiang, Zhuqing ^{[1]}

Men, Aidong ^{[1]}

Zhou, Yun ^{[2]}

Affiliations:

[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China

[2] Academy of Broadcasting Science, Beijing, People's Republic of China

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW) | 2017

Funding:

National Science Foundation (United States);

Keywords:

Object detection; convolutional neural network; predicting boxes generation; multi-feature; iterative localization;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

We introduce PBG-Net, an object detection system based on a carefully designed multi-feature deep CNN that works without proposal algorithms. First, PBG-Net aggregates hierarchical features into multi-feature maps and discretizes the output of the Conv5 feature map into a set of predicting boxes, a step called Predicting Boxes Generation (PBG). Then, PBG-Net crops the multi-feature maps by mapping the predicting boxes onto them and concatenates the cropped features. Finally, we exploit an iterative regression localization model based on a novel overlap loss function and online hard boxes selection. With around 100 boxes and end-to-end joint training, PBG-Net achieves 74.2% and 71.1% mAP on PASCAL VOC 2007 and PASCAL VOC 2012, respectively, at 12 fps on an NVIDIA GTX 1070p GPU, better than its Faster R-CNN counterparts.
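The abstract does not define how the predicting boxes are laid out or how the overlap loss is computed. As a rough illustration only, the sketch below assumes a uniform 10 x 10 grid (giving the "around 100 boxes" mentioned above) and uses 1 - IoU as a stand-in overlap loss. The function names `grid_boxes`, `iou`, and `overlap_loss` are hypothetical and are not taken from the paper.

```python
def grid_boxes(rows, cols, img_w, img_h):
    """Tile the image with rows * cols equal boxes as (x1, y1, x2, y2).

    Assumption: a uniform grid; the paper's actual discretization of the
    Conv5 feature map may differ.
    """
    cell_w, cell_h = img_w / cols, img_h / rows
    return [
        (c * cell_w, r * cell_h, (c + 1) * cell_w, (r + 1) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overlap_loss(pred, target):
    """Stand-in overlap loss (1 - IoU); not the paper's definition."""
    return 1.0 - iou(pred, target)


if __name__ == "__main__":
    boxes = grid_boxes(10, 10, 500, 500)
    ground_truth = (120.0, 80.0, 260.0, 240.0)
    # Pick the grid box that overlaps the ground truth the most.
    best = min(boxes, key=lambda b: overlap_loss(b, ground_truth))
    print(len(boxes), best, round(overlap_loss(best, ground_truth), 3))
```

In an iterative localization scheme, a regressor would refine each box over several steps and the loss would be re-evaluated after each step; that loop is omitted here because the abstract gives no details on it.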


Pages: 6

References: 24 in total

[21] Pedestrian Detection with Unsupervised Multi-Stage Feature Learning [J].

Sermanet, Pierre; Kavukcuoglu, Koray; Chintala, Soumith; LeCun, Yann.

2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013: 3626-3633

[22] Simonyan K., 2014, arXiv:1409.1556, DOI 10.1016/J.INFSOF.2008.09.005

[23] Selective Search for Object Recognition [J].

Uijlings, J. R. R.; van de Sande, K. E. A.; Gevers, T.; Smeulders, A. W. M.

INTERNATIONAL JOURNAL OF COMPUTER VISION, 2013, 104 (2): 154-171

[24] Unsupervised Learning of Visual Representations using Videos [J].

Wang, Xiaolong; Gupta, Abhinav.

2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2794-2802
