Recurrent self-optimizing proposals for weakly supervised object detection

Citations: 0
Authors
Zhang, Ming [1]
Zeng, Bing [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, 2006 Xiyuan Ave, Chengdu 610054, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Weakly supervised object detection; Recurrent self-optimizing proposals; Proposal self-transformation; Proposal self-sampling;
DOI
10.1007/s00521-022-07818-w
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Weakly supervised object detection (WSOD) has attracted increasing attention in object detection because it requires only image-level annotations to train the detector. A typical WSOD paradigm first generates candidate region proposals for the training data and then treats each image as a bag of proposals, training the detector with multiple instance learning (MIL). Most methods focus on optimizing the training process but rarely consider the influence of the pre-generated proposals, which directly affect the learning of the detector because they are dominated by noisy proposals (e.g., negative or background proposals) and include positive proposals with inaccurate locations. In this paper, we focus on improving the quality of proposals and propose a recurrent self-optimizing proposal framework, a new paradigm for WSOD, that iteratively optimizes the pre-generated proposals. In each iteration, all detection results (i.e., the object-aware coordinate offsets and the confidence scores) are accumulated for proposal optimization. To achieve accurate object localization, we design a proposal self-transformation module that transforms the locations of pre-generated proposals based on the coordinate offsets. To alleviate the impact of noisy proposals, we design a proposal self-sampling module that mines object instances using the confidence scores and filters out noisy proposals. The optimized proposals are then fed into a decoupled proposal learner, which contains two parallel proposal training branches: a MIL module and an instance refinement module, supervised by the image label and the mined object instances, respectively. In addition, the instance refinement module contains an instance regression refinement module, which generates the object-aware coordinate offsets. In turn, the decoupled proposal learner produces new detection results to optimize the proposals in the next iteration. Extensive experiments on the PASCAL VOC and MS-COCO datasets demonstrate the effectiveness of our method.
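
The proposal-optimization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the offset parameterization or the sampling rule, so the code below assumes R-CNN-style (dx, dy, dw, dh) offsets and a simple confidence threshold, and it does not model the accumulation of detection results across iterations. All function names and the `threshold` parameter are hypothetical.

```python
import math


def transform_proposal(box, offsets):
    """Shift and rescale one (x1, y1, x2, y2) box.

    ASSUMPTION: offsets are R-CNN-style (dx, dy, dw, dh), where dx and dy
    move the box center in units of box width and height, and dw and dh
    scale width and height in log space.
    """
    x1, y1, x2, y2 = box
    dx, dy, dw, dh = offsets
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    new_cx, new_cy = cx + dx * w, cy + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )


def sample_proposals(boxes, scores, threshold):
    """Keep proposals whose confidence is at least `threshold`.

    ASSUMPTION: plain thresholding stands in for the paper's
    self-sampling rule, which the abstract does not spell out.
    """
    return [b for b, s in zip(boxes, scores) if s >= threshold]


def optimize_proposals(boxes, offsets, scores, threshold=0.5):
    """One optimization pass: transform every proposal, then filter."""
    moved = [transform_proposal(b, o) for b, o in zip(boxes, offsets)]
    return sample_proposals(moved, scores, threshold)


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (5, 5, 15, 15)]
    offsets = [(0.1, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)]
    scores = [0.9, 0.2]
    print(optimize_proposals(boxes, offsets, scores))
```

In a recurrent setting, the offsets and scores would come from the detector trained in the previous iteration, and the filtered boxes would become its input proposals for the next one.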
Pages: 757-771
Page count: 15