IoU-uniform R-CNN: Breaking through the limitations of RPN

Cited by: 41
Authors
Zhu, Li [1]
Xie, Zihao [1]
Liu, Liman [4]
Tao, Bo [3]
Tao, Wenbing [1, 2]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China
[2] Shenzhen Huazhong Univ Sci & Technol, Res Inst, Shenzhen 518057, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, State Key Lab Digital Mfg Equipment & Technol, Wuhan 430074, Hubei, Peoples R China
[4] South Cent Univ Nationalities, Sch Biomed Engn, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Object detection; Two-stage detector; RPN; IoU distribution imbalance; FASTER;
DOI
10.1016/j.patcog.2021.107816
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Region Proposal Network (RPN) is the cornerstone of two-stage object detectors. It generates a sparse set of object proposals and alleviates the extreme foreground-background class imbalance problem during training. However, we find that the potential of the detector has not been fully exploited due to the IoU distribution imbalance and the inadequate quantity of the training samples generated by RPN. As the intersection over union (IoU) increases, the number of positive samples becomes exponentially smaller, which skews the distribution towards lower IoUs and hinders the optimization of the detector at high IoU levels. In this paper, to break through the limitations of RPN, we propose IoU-Uniform R-CNN, a simple but effective method that directly generates training samples with a uniform IoU distribution for the regression branch as well as the IoU prediction branch. Besides, we improve the performance of the IoU prediction branch by eliminating the feature offsets of RoIs at inference, which helps the NMS procedure by preserving accurately localized bounding boxes. Extensive experiments on the PASCAL VOC and MS COCO datasets show the effectiveness of our method, as well as its compatibility and adaptivity to many object detection architectures. The code is made publicly available at https://github.com/zl1994/IoU-Uniform-R-CNN. (c) 2021 Elsevier Ltd. All rights reserved.
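The idea of training samples with a uniform IoU distribution can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that); it is a minimal rejection-sampling example under assumptions of our own: boxes are (x1, y1, x2, y2) tuples, the IoU range [0.5, 1.0] is split into five equal bins, and the names `iou`, `sample_uniform_iou`, `per_bin`, and `max_jitter` are hypothetical.

```python
import random


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sample_uniform_iou(gt, per_bin=4, lo=0.5, n_bins=5, max_jitter=0.3,
                       max_tries=20000, seed=0):
    """Jitter a ground-truth box and keep at most `per_bin` boxes per IoU bin.

    Illustrative assumption: equal quotas per bin stand in for a
    "uniform IoU distribution". max_jitter must stay below 0.5 so that
    every jittered box keeps a positive width and height.
    """
    rng = random.Random(seed)
    w, h = gt[2] - gt[0], gt[3] - gt[1]
    bin_width = (1.0 - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for _ in range(max_tries):
        if all(len(b) >= per_bin for b in bins):
            break
        # Draw a jitter scale first so that small and large offsets
        # (hence high and low IoUs) are both reasonably likely.
        s = rng.uniform(0.0, max_jitter)
        box = (
            gt[0] + rng.uniform(-s, s) * w,
            gt[1] + rng.uniform(-s, s) * h,
            gt[2] + rng.uniform(-s, s) * w,
            gt[3] + rng.uniform(-s, s) * h,
        )
        v = iou(box, gt)
        if v < lo:
            continue  # below the positive-sample threshold assumed here
        idx = min(int((v - lo) / bin_width), n_bins - 1)
        if len(bins[idx]) < per_bin:
            bins[idx].append((box, v))
    return bins


if __name__ == "__main__":
    samples = sample_uniform_iou((10.0, 20.0, 110.0, 80.0))
    for i, b in enumerate(samples):
        print(f"bin {i}: {len(b)} boxes")
```

Capping each bin at the same quota is what flattens the distribution: plain random jitter, like RPN proposals, produces far fewer high-IoU boxes than low-IoU ones, so the high-IoU bins fill last.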
Pages: 11