Mask Scoring R-CNN

被引:857
作者
Huang, Zhaojin [1 ,2 ]
Huang, Lichao [2 ]
Gong, Yongchao [2 ]
Huang, Chang [2 ]
Wang, Xinggang [1 ]
机构
[1] Huazhong Univ Sci & Technol, Sch EIC, Inst AI, Wuhan, Peoples R China
[2] Horizon Robot Inc, Beijing, Peoples R China
来源
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019年
关键词
D O I
10.1109/CVPR.2019.00657
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Letting a deep network be aware of the quality of its own predictions is an interesting yet important problem. In the task of instance segmentation, the confidence of instance classification is used as mask quality score in most instance segmentation frameworks. However, the mask quality, quantified as the IoU between the instance mask and its ground truth, is usually not well correlated with classification score. In this paper, we study this problem and propose Mask Scoring R-CNN which contains a network block to learn the quality of the predicted instance masks. The proposed network block takes the instance feature and the corresponding predicted mask together to regress the mask IoU. The mask scoring strategy calibrates the misalignment between mask quality and mask score, and improves instance segmentation performance by prioritizing more accurate mask predictions during COCO AP evaluation. By extensive evaluations on the COCO dataset, Mask Scoring R-CNN brings consistent and noticeable gain with different models and outperforms the state-of-the-art Mask RCNN. We hope our simple and effective approach will provide a new direction for improving instance segmentation. The source code of our method is available at htpps://github.com/zjhuang22/maskscoring_rcnn.
引用
收藏
页码:6402 / 6411
页数:10
相关论文
共 36 条
[1]  
[Anonymous], 2016, ARXIV161108991
[2]  
[Anonymous], 2017, ARXIV171100164
[3]  
[Anonymous], 2017, ARXIV171204837
[4]   Deep Watershed Transform for Instance Segmentation [J].
Bai, Min ;
Urtasun, Raquel .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :2858-2866
[5]   Soft-NMS - Improving Object Detection With One Line of Code [J].
Bodla, Navaneeth ;
Singh, Bharat ;
Chellappa, Rama ;
Davis, Larry S. .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :5562-5570
[6]  
Chen HT, 2018, CHIN AUTOM CONGR, P881, DOI 10.1109/CAC.2018.8623182
[7]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848
[8]  
Cheng L, 2018, 2018 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS 2018), P473, DOI 10.1109/APCCAS.2018.8605613
[9]   Deformable Convolutional Networks [J].
Dai, Jifeng ;
Qi, Haozhi ;
Xiong, Yuwen ;
Li, Yi ;
Zhang, Guodong ;
Hu, Han ;
Wei, Yichen .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :764-773
[10]   Instance-aware Semantic Segmentation via Multi-task Network Cascades [J].
Dai, Jifeng ;
He, Kaiming ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3150-3158