A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

Cited by: 1116

Authors:

Cai, Zhaowei [1]

Fan, Quanfu [2]

Feris, Rogerio S. [2]

Vasconcelos, Nuno [1]

Affiliations:

[1] Univ Calif San Diego, SVCL, San Diego, CA 92103 USA

[2] IBM TJ Watson Res, Yorktown Hts, NY USA

Source:

COMPUTER VISION - ECCV 2016, PT IV | 2016 / Vol. 9908

Funding:

U.S. National Science Foundation;

Keywords:

Object detection; Multi-scale; Unified neural network;

DOI:

10.1007/978-3-319-46493-0_22

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A unified deep neural network, denoted the multi-scale CNN (MS-CNN), is proposed for fast multi-scale object detection. The MS-CNN consists of a proposal sub-network and a detection sub-network. In the proposal sub-network, detection is performed at multiple output layers, so that receptive fields match objects of different scales. These complementary scale-specific detectors are combined to produce a strong multi-scale object detector. The unified network is learned end-to-end, by optimizing a multi-task loss. Feature upsampling by deconvolution is also explored, as an alternative to input upsampling, to reduce the memory and computation costs. State-of-the-art object detection performance, at up to 15 fps, is reported on datasets, such as KITTI and Caltech, containing a substantial number of small objects.
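The abstract's point that output layers at different depths have receptive fields suited to different object scales can be illustrated with standard receptive-field arithmetic. The sketch below is not the MS-CNN architecture: the layer stack, the layer names, and the function `receptive_fields` are hypothetical and chosen only to show how the receptive field and stride grow with depth in a plain stack of convolutions and poolings.

```python
def receptive_fields(layers):
    """Return (name, receptive_field, stride) after each layer.

    layers: list of (name, kernel_size, stride) tuples applied in order.
    Sizes are in input pixels along one spatial dimension.
    """
    rf, jump = 1, 1  # one input pixel; adjacent positions are 1 px apart
    out = []
    for name, kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra kernel tap adds one current jump
        jump *= stride             # striding spreads output positions apart
        out.append((name, rf, jump))
    return out


# Hypothetical stack (an assumption, not taken from the paper):
# 3x3 convolutions with stride 1, followed by 2x2 poolings with stride 2.
HYPOTHETICAL_LAYERS = [
    ("conv1", 3, 1), ("pool1", 2, 2),
    ("conv2", 3, 1), ("pool2", 2, 2),
    ("conv3", 3, 1), ("pool3", 2, 2),
    ("conv4", 3, 1),
]

for name, rf, stride in receptive_fields(HYPOTHETICAL_LAYERS):
    print(f"{name}: receptive field {rf} px, stride {stride} px")
```

In this toy stack the receptive field grows from 3 px after `conv1` to 38 px after `conv4`, so a detector attached to an earlier layer sees a window closer in size to small objects, while one attached to a later layer covers larger ones.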

Pages: 354-370

Page count: 17

References: 43 in total

