VarifocalNet: An IoU-aware Dense Object Detector

被引:697
作者
Zhang, Haoyang [1 ]
Wang, Ying [2 ]
Dayoub, Feras [1 ]
Sunderhauf, Niko [1 ]
机构
[1] Queensland Univ Technol, Australian Ctr Robot Vis, Brisbane, Qld, Australia
[2] Univ Queensland, Brisbane, Qld, Australia
来源
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021年
关键词
D O I
10.1109/CVPR46437.2021.00841
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Accurately ranking the vast number of candidate detections is crucial for dense object detectors to achieve high performance. Prior work uses the classification score or a combination of classification and predicted localization scores to rank candidates. However, neither option results in a reliable ranking, thus degrading detection performance. In this paper, we propose to learn an Iou-Aware Classification Score (IACS) as a joint representation of object presence confidence and localization accuracy. We show that dense object detectors can achieve a more accurate ranking of candidate detections based on the IACS. We design a new loss function, named Varifocal Loss, to train a dense object detector to predict the IACS, and propose a new star-shaped bounding box feature representation for IACS prediction and bounding box refinement. Combining these two new components and a bounding box refinement branch, we build an IoU-aware dense object detector based on the FCOS+ATSS architecture, that we call VarifocalNet or VFNet for short. Extensive experiments on MS COCO show that our VFNet consistently surpasses the strong baseline by similar to 2.0 AP with different backbones. Our best model VFNet-X-1200 with Res2Net-101-DCN achieves a single-model single-scale AP of 55.1 on COCO test-dev, which is state-of-the-art among various object detectors.
引用
收藏
页码:8510 / 8519
页数:10
相关论文
共 53 条
[1]   Soft-NMS - Improving Object Detection With One Line of Code [J].
Bodla, Navaneeth ;
Singh, Bharat ;
Chellappa, Rama ;
Davis, Larry S. .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :5562-5570
[2]   Hierarchical Shot Detector [J].
Cao, Jiale ;
Pang, Yanwei ;
Han, Jungong ;
Li, Xuelong .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :9704-9713
[3]   Prime Sample Attention in Object Detection [J].
Cao, Yuhang ;
Chen, Kai ;
Loy, Chen Change ;
Lin, Dahua .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :11580-11588
[4]  
Chen Kai, 2019, ARXIV190607155
[5]   Deformable Convolutional Networks [J].
Dai, Jifeng ;
Qi, Haozhi ;
Xiong, Yuwen ;
Li, Yi ;
Zhang, Guodong ;
Hu, Han ;
Wei, Yichen .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :764-773
[6]  
Dai J, 2016, PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), P1796, DOI 10.1109/ICIT.2016.7475036
[7]   CenterNet: Keypoint Triplets for Object Detection [J].
Duan, Kaiwen ;
Bai, Song ;
Xie, Lingxi ;
Qi, Honggang ;
Huang, Qingming ;
Tian, Qi .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :6568-6577
[8]  
Fu C.Y., 2017, ARXIV
[9]   Res2Net: A New Multi-Scale Backbone Architecture [J].
Gao, Shang-Hua ;
Cheng, Ming-Ming ;
Zhao, Kai ;
Zhang, Xin-Yu ;
Yang, Ming-Hsuan ;
Torr, Philip .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (02) :652-662
[10]   Region-Based Convolutional Networks for Accurate Object Detection and Segmentation [J].
Girshick, Ross ;
Donahue, Jeff ;
Darrell, Trevor ;
Malik, Jitendra .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (01) :142-158