Weakly Supervised Object Detection With Segmentation Collaboration

Cited by: 88
Authors
Li, Xiaoyan [1, 2]
Kan, Meina [1, 2]
Shan, Shiguang [1, 2, 3]
Chen, Xilin [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc Chinese Acad Sc, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Keywords
DOI
10.1109/ICCV.2019.00983
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Weakly supervised object detection aims at learning precise object detectors given only image category labels. In recent prevailing works, this problem is generally formulated as a multiple instance learning module guided by an image classification loss. The object bounding box is assumed to be the proposal that contributes most to the classification among all proposals. However, the region contributing most is also likely to be a crucial part of an object or its supporting context. To obtain a more accurate detector, in this work we propose a novel end-to-end weakly supervised detection approach, where a newly introduced generative adversarial segmentation module interacts with the conventional detection module in a collaborative loop. The collaboration mechanism takes full advantage of the complementary interpretations of the weakly supervised localization task, namely the detection and segmentation tasks, forming a more comprehensive solution. Consequently, our method obtains more precise object bounding boxes, rather than parts or irrelevant surroundings. As expected, the proposed method achieves an accuracy of 53.7% on the PASCAL VOC 2007 dataset, outperforming state-of-the-art methods and demonstrating its superiority for weakly supervised object detection.
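The multiple instance learning formulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a two-stream scoring scheme in the style of Weakly Supervised Deep Detection Networks (softmax over classes times softmax over proposals); the abstract does not specify the exact form used in this paper, and all function names below are hypothetical. The segmentation module and the collaboration loop are not modelled here.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mil_image_scores(cls_logits, det_logits):
    """Assumed two-stream MIL scoring (not taken from the paper).

    cls_logits, det_logits: one row per proposal, one logit per class.
    Returns per-proposal scores and image-level class scores.
    """
    num_props = len(cls_logits)
    num_classes = len(cls_logits[0])
    # Classification stream: softmax over classes for each proposal.
    cls_prob = [softmax(row) for row in cls_logits]
    # Detection stream: softmax over proposals for each class.
    det_prob = [
        softmax([det_logits[r][c] for r in range(num_props)])
        for c in range(num_classes)
    ]
    prop_scores = [
        [cls_prob[r][c] * det_prob[c][r] for c in range(num_classes)]
        for r in range(num_props)
    ]
    # Image-level score: sum of proposal scores for each class.
    image_scores = [
        sum(prop_scores[r][c] for r in range(num_props))
        for c in range(num_classes)
    ]
    return prop_scores, image_scores


def image_bce_loss(image_scores, labels, eps=1e-7):
    """Binary cross-entropy between image scores and image-level labels."""
    loss = 0.0
    for s, y in zip(image_scores, labels):
        s = min(max(s, eps), 1.0 - eps)
        loss -= y * math.log(s) + (1 - y) * math.log(1.0 - s)
    return loss


def top_proposal(prop_scores, class_index):
    """Index of the proposal contributing most to the given class."""
    return max(range(len(prop_scores)), key=lambda r: prop_scores[r][class_index])
```

Under this sketch, `top_proposal` picks the highest-contributing region as the detected box, which is exactly the assumption the abstract criticizes: that region may be a discriminative part or surrounding context rather than the whole object.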
Pages: 9734-9743
Page count: 10