D2Det: Towards High Quality Object Detection and Instance Segmentation

Cited by: 136
Authors
Cao, Jiale [1]
Cholakkal, Hisham [2]
Anwer, Rao Muhammad [2]
Khan, Fahad Shahbaz [2]
Pang, Yanwei [1]
Shao, Ling [2]
Affiliations
[1] Tianjin University, Tianjin, People's Republic of China
[2] Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, United Arab Emirates
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
DOI
10.1109/CVPR42600.2020.01150
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a novel two-stage detection method, D2Det, that collectively addresses both precise localization and accurate classification. For precise localization, we introduce a dense local regression that predicts multiple dense box offsets for an object proposal. Different from traditional regression and keypoint-based localization employed in two-stage detectors, our dense local regression is not limited to a quantized set of keypoints within a fixed region and has the ability to regress position-sensitive real-valued dense offsets, leading to more precise localization. The dense local regression is further improved by a binary overlap prediction strategy that reduces the influence of background regions on the final box regression. For accurate classification, we introduce a discriminative RoI pooling scheme that samples from various sub-regions of a proposal and performs adaptive weighting to obtain discriminative features. On MS COCO test-dev, our D2Det outperforms existing two-stage methods, with a single-model performance of 45.4 AP, using a ResNet101 backbone. When using multi-scale training and inference, D2Det obtains an AP of 50.1. In addition to detection, we adapt D2Det for instance segmentation, achieving a mask AP of 40.2 with a two-fold speedup, compared to the state of the art. We also demonstrate the effectiveness of our D2Det on airborne sensors by performing experiments for object detection in UAV images (UAVDT dataset) and instance segmentation in satellite images (iSAID dataset). Source code is available at https://github.com/JialeCao001/D2Det.
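The dense local regression and binary overlap prediction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `encode_dense_offsets` and `decode_dense_offsets`, the k x k grid of cell centres, the normalisation of offsets by the proposal's width and height, the 0.5 threshold, and the averaging of per-cell boxes are all assumptions made for illustration. The repository linked in the abstract remains the authoritative reference.

```python
import numpy as np


def _cell_centres(proposal, k):
    """Centres of a k x k grid laid over the proposal (assumed layout)."""
    x1, y1, x2, y2 = proposal
    cx = x1 + (np.arange(k) + 0.5) * (x2 - x1) / k
    cy = y1 + (np.arange(k) + 0.5) * (y2 - y1) / k
    return np.meshgrid(cx, cy)  # each of shape (k, k)


def encode_dense_offsets(proposal, gt_box, k=7):
    """Build hypothetical regression targets for one proposal.

    Returns real-valued (left, top, right, bottom) offsets per cell,
    normalised by the proposal size, and a binary overlap mask that is
    1 where the cell centre falls inside the ground-truth box.
    """
    x1, y1, x2, y2 = proposal
    gx1, gy1, gx2, gy2 = gt_box
    w, h = x2 - x1, y2 - y1
    gx, gy = _cell_centres(proposal, k)
    offsets = np.stack(
        [(gx - gx1) / w, (gy - gy1) / h, (gx2 - gx) / w, (gy2 - gy) / h],
        axis=-1,
    )
    overlap = ((gx >= gx1) & (gx <= gx2) & (gy >= gy1) & (gy <= gy2))
    return offsets, overlap.astype(float)


def decode_dense_offsets(proposal, offsets, overlap, thresh=0.5):
    """Combine per-cell offsets into a single box.

    Each cell proposes its own box; only cells whose overlap score
    exceeds `thresh` vote, which limits the influence of background
    cells. The votes are averaged (an assumed aggregation rule).
    """
    x1, y1, x2, y2 = proposal
    w, h = x2 - x1, y2 - y1
    gx, gy = _cell_centres(proposal, offsets.shape[0])
    boxes = np.stack(
        [
            gx - offsets[..., 0] * w,
            gy - offsets[..., 1] * h,
            gx + offsets[..., 2] * w,
            gy + offsets[..., 3] * h,
        ],
        axis=-1,
    )
    keep = overlap > thresh
    if not keep.any():
        # No cell is predicted to lie on the object: fall back to the proposal.
        return np.asarray(proposal, dtype=float)
    return boxes[keep].mean(axis=0)
```

With exact targets, for example `offsets, overlap = encode_dense_offsets((0, 0, 10, 10), (2, 1, 9, 8), k=4)`, decoding recovers the ground-truth box. In a trained detector the offsets and overlap scores would instead come from network predictions, so the averaged box would only approximate it.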
Pages: 11482-11491
Page count: 10