High-Quality R-CNN Object Detection Using Multi-Path Detection Calibration Network

Cited by: 53
Authors
Chen, Xiaoyu [1]
Li, Hongliang [1]
Wu, Qingbo [1]
Ngan, King Ngi [1]
Xu, Linfeng [1]
Affiliation
[1] University of Electronic Science and Technology of China, School of Information and Communication Engineering, Chengdu 611731, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Convolutional neural networks (CNNs); deep learning; object detection; object recognition
DOI
10.1109/TCSVT.2020.2987465
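Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract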
Two-stage detectors such as R-CNN use object proposals to generate detection results, which consist of category predictions and refined bounding boxes. As a result, classification scores are assigned to the refined bounding boxes rather than to the object proposals from which they were computed. This procedure ignores the discrepancy in data distribution between object proposals and refined bounding boxes, and we argue that this discrepancy can limit detection accuracy. Specifically, the foreground/background imbalance among object proposals and the inaccurate information carried by low-IoU proposals can hinder category prediction. In this paper, we propose a detector called the Multi-Path Detection Calibration Network (PDC-Net) to address this problem. The key idea behind PDC-Net is to calibrate the detection results of R-CNN by accounting for the statistical discrepancy between object proposals and refined bounding boxes. PDC-Net is built on Faster R-CNN. Its core component is the multi-path detection head, in which the base detector (from Faster R-CNN) generates detection results from object proposals, and multiple calibration detectors correct erroneous outputs of the base detector using the refined bounding boxes. Experiments show that PDC-Net improves detection results: our method reaches 83.1% and 43.3% mAP on the PASCAL VOC and MSCOCO benchmarks, respectively, which is comparable to several state-of-the-art methods.
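The abstract refers to "low-IoU proposals" without defining the term. As a reading aid only, here is a minimal sketch of intersection over union (IoU) between two axis-aligned boxes. The function name `iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration; they are not taken from the paper, and this is not PDC-Net's implementation.

```python
from typing import Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(a[0], b[0])
    iy1 = max(a[1], b[1])
    ix2 = min(a[2], b[2])
    iy2 = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate zero-area boxes.
    return inter / union if union > 0 else 0.0
```

A proposal that overlaps its ground-truth box only slightly has an IoU close to 0, which is the kind of proposal the abstract describes as carrying inaccurate information for category prediction.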
Pages: 715-727
Number of pages: 13