Min-Entropy Latent Model for Weakly Supervised Object Detection

Cited by: 141
Authors
Wan, Fang [1]
Wei, Pengxu [1]
Jiao, Jianbin [1]
Han, Zhenjun [1]
Ye, Qixiang [1]
Affiliation
[1] Univ Chinese Acad Sci, Beijing, Peoples R China
Source
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018
Keywords
LOCALIZATION;
DOI
10.1109/CVPR.2018.00141
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Weakly supervised object detection is a challenging task: only image-category supervision is provided, yet object locations and object detectors must be learned at the same time. The inconsistency between the weak supervision and learning objectives introduces randomness to object locations and ambiguity to detectors. In this paper, a min-entropy latent model (MELM) is proposed for weakly supervised object detection. Min-entropy serves both as a metric to measure the randomness of object localization during learning and as a model to learn object locations. It aims to principally reduce the variance of positive instances and alleviate the ambiguity of detectors. MELM is deployed as two sub-models, which respectively discover and localize objects by minimizing the global and local entropy. MELM is unified with feature learning and optimized with a recurrent learning algorithm, which progressively transfers the weak supervision to object locations. Experiments demonstrate that MELM significantly improves the performance of weakly supervised detection, weakly supervised localization, and image classification compared with state-of-the-art approaches.
Pages: 1297-1306
Number of pages: 10