Min-Entropy Latent Model for Weakly Supervised Object Detection

Cited by: 141

Authors:

Wan, Fang [1]

Wei, Pengxu [1]

Jiao, Jianbin [1]

Han, Zhenjun [1]

Ye, Qixiang [1]

Affiliation:

[1] University of Chinese Academy of Sciences, Beijing, People's Republic of China

Source:

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018

Keywords:

LOCALIZATION;

DOI:

10.1109/CVPR.2018.00141

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Weakly supervised object detection is challenging because the learner is given only image category supervision but must learn object locations and object detectors at the same time. The inconsistency between the weak supervision and the learning objectives introduces randomness to object locations and ambiguity to detectors. In this paper, a min-entropy latent model (MELM) is proposed for weakly supervised object detection. Min-entropy is used as a metric to measure the randomness of object localization during learning, and it also serves as a model to learn object locations. It aims to principally reduce the variance of positive instances and alleviate the ambiguity of detectors. MELM is deployed as two sub-models, which respectively discover and localize objects by minimizing the global and local entropy. MELM is unified with feature learning and optimized with a recurrent learning algorithm, which progressively transfers the weak supervision to object locations. Experiments demonstrate that MELM significantly improves the performance of weakly supervised detection, weakly supervised localization, and image classification compared with state-of-the-art approaches.
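
The abstract's idea of using entropy as a measure of localization randomness can be illustrated with a minimal sketch. This is not the paper's formulation of global or local entropy; it only assumes, for illustration, that each region proposal has a scalar score, that the scores are turned into a distribution with a softmax, and that Shannon entropy of that distribution is the randomness measure. The function names and the example scores are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw proposal scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical scores for four region proposals in one image.
peaked = softmax([8.0, 0.5, 0.2, 0.1])   # one proposal clearly dominates
diffuse = softmax([1.0, 1.0, 1.0, 1.0])  # no proposal is preferred

# Lower entropy means the location estimate is less random.
print(entropy(peaked), entropy(diffuse))
```

Under these assumptions, a learning objective that pushes this entropy down favors the peaked case, where one location is selected with confidence, over the diffuse case, where the location is effectively random.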

Citation:

Pages: 1297-1306

Page count: 10

References: 46 in total

