DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution

Cited by: 694
Authors
Qiao, Siyuan [1]
Chen, Liang-Chieh [2]
Yuille, Alan [1]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Google Res, Mountain View, CA USA
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Keywords
COMPETITION; MECHANISMS
DOI
10.1109/CVPR46437.2021.01008
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many modern object detectors achieve outstanding performance by using the mechanism of looking and thinking twice. In this paper, we explore this mechanism in the backbone design for object detection. At the macro level, we propose the Recursive Feature Pyramid, which incorporates extra feedback connections from Feature Pyramid Networks into the bottom-up backbone layers. At the micro level, we propose Switchable Atrous Convolution, which convolves the features with different atrous rates and gathers the results using switch functions. Combining them results in DetectoRS, which significantly improves object detection performance. On COCO test-dev, DetectoRS achieves state-of-the-art 55.7% box AP for object detection, 48.5% mask AP for instance segmentation, and 50.0% PQ for panoptic segmentation. The code is publicly available.
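The abstract's description of Switchable Atrous Convolution (convolving the same features at different atrous rates and gathering the results with a switch function) can be illustrated with a minimal 1-D sketch in pure Python. This is not the authors' implementation. The function names, the rates (1, 3), the weights shared between the two branches, and the switch (a sigmoid of a 3-sample local average with hypothetical parameters `switch_w` and `switch_b`) are all assumptions chosen for illustration.

```python
import math


def atrous_conv1d(x, w, rate):
    """Zero-padded, same-length 1-D atrous (dilated) convolution."""
    half = len(w) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, wj in enumerate(w):
            idx = i + (j - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= idx < len(x):
                acc += wj * x[idx]
        out.append(acc)
    return out


def switchable_atrous_conv1d(x, w, rates=(1, 3), switch_w=1.0, switch_b=0.0):
    """Blend two atrous rates per position with a soft switch in (0, 1).

    Assumption: both branches share the weights `w`, and the switch is a
    sigmoid of a local average. These are illustrative choices only.
    """
    small = atrous_conv1d(x, w, rates[0])
    large = atrous_conv1d(x, w, rates[1])
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - 1), min(len(x), i + 2)
        avg = sum(x[lo:hi]) / (hi - lo)
        s = 1.0 / (1.0 + math.exp(-(switch_w * avg + switch_b)))
        # s near 1 favors the small rate, s near 0 favors the large rate
        out.append(s * small[i] + (1.0 - s) * large[i])
    return out


if __name__ == "__main__":
    signal = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    print(switchable_atrous_conv1d(signal, [1.0, 1.0, 1.0]))
```

In this sketch the switch is computed per position, so each output sample is a convex combination of the two branch outputs at that position; pushing `switch_b` strongly positive or negative recovers a plain convolution at one rate.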
Pages: 10208-10219
Page count: 12