D2Det: Towards High Quality Object Detection and Instance Segmentation

被引:136
作者
Cao, Jiale [1 ]
Cholakkal, Hisham [2 ]
Anwer, Rao Muhammad [2 ]
Khan, Fahad Shahbaz [2 ]
Pang, Yanwei [1 ]
Shao, Ling [2 ]
机构
[1] Tianjin Univ, Tianjin, Peoples R China
[2] Incept Inst Artificial Intelligence IIAI, Abu Dhabi, U Arab Emirates
来源
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020年
基金
中国国家自然科学基金; 中国博士后科学基金;
关键词
D O I
10.1109/CVPR42600.2020.01150
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We propose a novel two-stage detection method, D2Det, that collectively addresses both precise localization and accurate classification. For precise localization, we introduce a dense local regression that predicts multiple dense box offsets for an object proposal. Different from traditional regression and keypoint-based localization employed in two-stage detectors, our dense local regression is not limited to a quantized set of keypoints within a fixed region and has the ability to regress position-sensitive real number dense offsets, leading to more precise localization. The dense local regression is further improved by a binary overlap prediction strategy that reduces the influence of background region on the final box regression. For accurate classification, we introduce a discriminative RoI pooling scheme that samples from various sub-regions of a proposal and performs adaptive weighting to obtain discriminative features. On MS COCO test-dev, our D2Det outperforms existing two-stage methods, with a single-model performance of 45.4 AP, using ResNet101 backbone. When using multi-scale training and inference, D2Det obtains AP of 50.1. In addition to detection, we adapt D2Det for instance segmentation, achieving a mask AP of 40.2 with a two-fold speedup, compared to the state-of-the-art. We also demonstrate the effectiveness of our D2Det on airborne sensors by performing experiments for object detection in UAV images (UAVDT dataset) and instance segmentation in satellite images (iSAID dataset). Source code is available at https://github.com/JialeCao001/D2Det.
引用
收藏
页码:11482 / 11491
页数:10
相关论文
共 54 条
[1]   Soft-NMS - Improving Object Detection With One Line of Code [J].
Bodla, Navaneeth ;
Singh, Bharat ;
Chellappa, Rama ;
Davis, Larry S. .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :5562-5570
[2]   Triply Supervised Decoder Networks for Joint Detection and Segmentation [J].
Cao, Jiale ;
Pang, Yanwei ;
Li, Xuelong .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :7384-7393
[3]  
Cao Jiale, 2019, P IEEE INT C COMP VI
[4]   Phase 2a Safety, Pharmacokinetics, and Acceptability of Dapivirine Vaginal Rings in US Postmenopausal Women [J].
Chen, Beatrice A. ;
Zhang, Jingyang ;
Gundacker, Holly M. ;
Hendrix, Craig W. ;
Hoesley, Craig J. ;
Salata, Robert A. ;
Dezzutti, Charlene S. ;
van der Straten, Ariane ;
Hall, Wayne B. ;
Jacobson, Cindy E. ;
Johnson, Sherri ;
McGowan, Ian ;
Nel, Annalene M. ;
Soto-Torres, Lydia ;
Marzinke, Mark A. .
CLINICAL INFECTIOUS DISEASES, 2019, 68 (07) :1144-1151
[5]   Hybrid Task Cascade for Instance Segmentation [J].
Chen, Kai ;
Pang, Jiangmiao ;
Wang, Jiaqi ;
Xiong, Yu ;
Li, Xiaoxiao ;
Sun, Shuyang ;
Feng, Wansen ;
Liu, Ziwei ;
Shi, Jianping ;
Ouyang, Wanli ;
Loy, Chen Change ;
Lin, Dahua .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :4969-4978
[6]   CaMap: Camera-based Map Manipulation on Mobile Devices [J].
Chen, Liang ;
Chen, Dongyi .
PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND APPLICATION ENGINEERING (CSAE2018), 2018,
[7]   Deformable Convolutional Networks [J].
Dai, Jifeng ;
Qi, Haozhi ;
Xiong, Yuwen ;
Li, Yi ;
Zhang, Guodong ;
Hu, Han ;
Wei, Yichen .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :764-773
[8]   Instance-aware Semantic Segmentation via Multi-task Network Cascades [J].
Dai, Jifeng ;
He, Kaiming ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3150-3158
[9]  
Du Dawei, 2018, P EUR C COMP VIS
[10]   CenterNet: Keypoint Triplets for Object Detection [J].
Duan, Kaiwen ;
Bai, Song ;
Xie, Lingxi ;
Qi, Honggang ;
Huang, Qingming ;
Tian, Qi .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :6568-6577