Dynamic Refinement Network for Oriented and Densely Packed Object Detection

Cited by: 299
Authors
Pan, Xingjia [1, 2]
Ren, Yuqiang [3]
Sheng, Kekai [3]
Dong, Weiming [1, 2, 4]
Yuan, Haolei [3]
Guo, Xiaowei [3]
Ma, Chongyang [5]
Xu, Changsheng [1, 2, 4]
Affiliations
[1] Chinese Acad Sci, Inst Automat, NLPR, Beijing, Peoples R China
[2] UCAS, Sch Artificial Intelligence, Beijing, Peoples R China
[3] Tencent, Youtu Lab, Shenzhen, Peoples R China
[4] CASIA LLVis Joint Lab, Beijing, Peoples R China
[5] Kuaishou Technol, Y Tech, Beijing, Peoples R China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
DOI
10.1109/CVPR42600.2020.01122
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Object detection has achieved remarkable progress in the past decade. However, the detection of oriented and densely packed objects remains challenging for the following inherent reasons: (1) the receptive fields of neurons are all axis-aligned and of the same shape, whereas objects usually have diverse shapes and align along various directions; (2) detection models are typically trained with generic knowledge and may not generalize well to specific objects at test time; (3) the limited availability of datasets hinders progress on this task. To resolve the first two issues, we present a dynamic refinement network that consists of two novel components: a feature selection module (FSM) and a dynamic refinement head (DRH). Our FSM enables neurons to adjust their receptive fields in accordance with the shapes and orientations of target objects, whereas the DRH empowers our model to refine its predictions dynamically in an object-aware manner. To address the limited availability of related benchmarks, we collect an extensive and fully annotated dataset, namely SKU110K-R, which is relabeled with oriented bounding boxes based on SKU110K. We perform quantitative evaluations on several publicly available benchmarks, including DOTA, HRSC2016, SKU110K, and our own SKU110K-R dataset. Experimental results show that our method achieves consistent and substantial gains compared with baseline approaches. Our source code and dataset will be released to encourage follow-up research.
Pages: 11204-11213
Number of pages: 10