Residual Super-Resolution Single Shot Network for Low-Resolution Object Detection

Cited by: 26
Authors
Zhao, Xiaotong [1, 2]
Li, Wei [3]
Zhang, Yifan [1, 2]
Feng, Zhiyong [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Key Lab Universal Wireless Commun, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Res Inst, Shenzhen 518057, Peoples R China
[3] Northern Illinois Univ, Dept Elect Engn, De Kalb, IL 60115 USA
Funding
National Natural Science Foundation of China
Keywords
Object detection; convolutional neural networks; image resolution; image super-resolution; recognition
DOI
10.1109/ACCESS.2018.2867586
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In object detection, models trained on high-resolution images often fail to recognize or localize objects in low-resolution images. To tackle this problem, we propose a fully convolutional network named the residual super-resolution single shot network (RSRSSN). RSRSSN consists of two sub-networks: a super-resolution sub-network and a detection sub-network. The super-resolution sub-network is built by stacking identity residual blocks, while the detection sub-network adopts the single shot multibox detector (SSD). Based on multi-task learning, we design a novel objective function, called the feature maps multibox loss, that forces low-resolution images to produce feature maps similar to those of their corresponding high-resolution images. The experiments show this information-sharing mechanism to be critical for solving the resolution mismatch problem. A two-step training scheme is also proposed to train the RSRSSN in an end-to-end manner. Without any data augmentation, RSRSSN outperforms SSD on both down-sampled PASCAL VOC and MS COCO in real-time object detection.
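The abstract does not give the form of the feature maps multibox loss, so the following is only an illustrative sketch of the general idea it describes: a detection loss combined with a penalty that pulls the feature maps computed from a low-resolution image toward those computed from its high-resolution counterpart. The function names `feature_map_mse` and `combined_loss`, the mean-squared-error penalty, the per-layer averaging, and the `weight` parameter are all assumptions made for illustration and are not taken from the paper.

```python
def feature_map_mse(lr_feats, hr_feats):
    """Mean squared error between two flattened feature maps of equal size."""
    if len(lr_feats) != len(hr_feats):
        raise ValueError("feature maps must have the same number of elements")
    if not lr_feats:
        raise ValueError("feature maps must be non-empty")
    return sum((a - b) ** 2 for a, b in zip(lr_feats, hr_feats)) / len(lr_feats)


def combined_loss(detection_loss, lr_feature_maps, hr_feature_maps, weight=1.0):
    """Detection loss plus a weighted feature-matching penalty.

    detection_loss: scalar loss from the detector (assumed computed elsewhere).
    lr_feature_maps / hr_feature_maps: one flattened feature map per layer,
        from the low-resolution and high-resolution inputs respectively.
    weight: hypothetical trade-off factor between the two terms.
    """
    if len(lr_feature_maps) != len(hr_feature_maps):
        raise ValueError("both inputs must cover the same number of layers")
    if not lr_feature_maps:
        raise ValueError("at least one layer is required")
    # Average the per-layer mismatch so the penalty does not grow with depth.
    matching = sum(
        feature_map_mse(lr, hr)
        for lr, hr in zip(lr_feature_maps, hr_feature_maps)
    ) / len(lr_feature_maps)
    return detection_loss + weight * matching
```

When the low-resolution and high-resolution feature maps are identical, the penalty vanishes and `combined_loss` returns the detection loss unchanged; any mismatch adds to the loss in proportion to `weight`.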
Pages: 47780-47793
Number of pages: 14