Learning Object Detectors With Semi-Annotated Weak Labels

Cited by: 20

Authors:

Zhang, Dingwen [1]

Han, Junwei [2]

Guo, Guangyu [2]

Zhao, Long [2]

Affiliations:

[1] Xidian Univ, Sch Mechanoelect Engn, Xian 710071, Shaanxi, Peoples R China

[2] Northwestern Polytech Univ, Sch Automat, Xian 710072, Shaanxi, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2019 / Vol. 29 / No. 12

Funding:

National Key Research and Development Program of China; US National Science Foundation

Keywords:

Training; Object detection; Detectors; Training data; Generators; Visualization; Semantics; Computer vision; image processing; object detection; learning (artificial intelligence); LOCALIZATION; SALIENT; DEEP;

DOI:

10.1109/TCSVT.2018.2884173

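Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code: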

0808 ; 0809 ;

Abstract:

To reduce the human labor required to annotate training data for object detectors, recent research has focused on semi-supervised object detection (SSOD) and weakly supervised object detection (WSOD). In SSOD, annotators draw bounding boxes for only part of the training instances rather than for every instance in the training set. In WSOD, annotators provide image-level tags on all training images to indicate the object categories each image contains, and detailed bounding box annotations are no longer needed. Along this line of research, this paper takes a further step toward reducing annotation labor, leading to the problem of object detection with semi-annotated weak labels (ODSAWL). Instead of requiring image-level tags on all training images, ODSAWL needs tags for only a small portion of the training images; the object detectors are then learned from this small weakly labeled portion together with the remaining unlabeled training images. To address this challenging problem, this paper proposes a cross-model co-training framework in which an object localizer and a tag generator collaborate in an alternating optimization procedure. Specifically, during learning, these two (deep) models transfer the needed knowledge (including labels and visual patterns) to each other. The whole learning procedure is accomplished in a few stages under the guidance of a progressive learning curriculum. To demonstrate the effectiveness of the proposed approach, we conduct comprehensive experiments on three benchmark datasets, and the results are encouraging. Notably, with only about 15% of the training images weakly labeled, the proposed approach can come close to, or even outperform, state-of-the-art WSOD methods.
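The abstract describes the training loop only at a high level, so the following is a minimal toy sketch of the general idea (two models exchanging pseudo-labels in alternation, with a curriculum that admits the most confident unlabeled samples first), not the authors' implementation. Everything here is an assumption made for illustration: `CentroidModel`, `co_train`, and `make_data` are hypothetical names, 1-D nearest-centroid classifiers stand in for the deep object localizer and tag generator, and the synthetic two-feature data stands in for images.

```python
import random


class CentroidModel:
    """Toy 1-D nearest-centroid classifier (a stand-in for a deep model)."""

    def __init__(self, view):
        self.view = view        # index of the single feature this model sees
        self.centroids = {}

    def fit(self, samples, labels):
        # Assumes both classes appear at least once in `labels`.
        for c in (0, 1):
            vals = [s[self.view] for s, y in zip(samples, labels) if y == c]
            self.centroids[c] = sum(vals) / len(vals)

    def predict(self, sample):
        d0 = abs(sample[self.view] - self.centroids[0])
        d1 = abs(sample[self.view] - self.centroids[1])
        label = 0 if d0 < d1 else 1
        return label, abs(d0 - d1)  # (pseudo-label, confidence)


def co_train(labeled, labels, unlabeled, stages=4):
    """Alternate teacher/student roles; admit confident samples first."""
    model_a, model_b = CentroidModel(0), CentroidModel(1)
    pool_x, pool_y = list(labeled), list(labels)
    remaining = list(unlabeled)
    for stage in range(stages):
        # The two models swap roles at every stage.
        if stage % 2 == 0:
            teacher, student = model_a, model_b
        else:
            teacher, student = model_b, model_a
        teacher.fit(pool_x, pool_y)
        # Rank still-unlabeled samples by the teacher's confidence.
        scored = sorted(
            ((teacher.predict(x), x) for x in remaining),
            key=lambda item: item[0][1],
            reverse=True,
        )
        # Curriculum: the admitted share grows until everything is used.
        keep = len(scored) * (stage + 1) // stages
        for (label, _), x in scored[:keep]:
            pool_x.append(x)
            pool_y.append(label)
        remaining = [x for _, x in scored[keep:]]
        # The student learns from the pool enlarged by the teacher's labels.
        student.fit(pool_x, pool_y)
    return model_a, model_b, remaining


def make_data(n, seed=0):
    """Synthetic two-feature samples; the class alternates with the index."""
    rng = random.Random(seed)
    xs, ys = [], []
    for i in range(n):
        y = i % 2
        center = 2.0 if y else -2.0
        xs.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        ys.append(y)
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_data(200)
    # Only the first 30 samples (15%) keep their labels.
    model_a, model_b, _ = co_train(xs[:30], ys[:30], xs[30:])
    hits = sum(model_b.predict(x)[0] == y for x, y in zip(xs, ys))
    print(f"model_b accuracy: {hits / len(xs):.2f}")
```

In this sketch the labeled share, the number of stages, and the linear growth of the admitted fraction are arbitrary choices; the paper's actual curriculum and knowledge-transfer mechanisms are not specified in this record.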

Citation

Pages: 3622-3635

Page count: 14
