Where to Look?: Mining Complementary Image Regions for Weakly Supervised Object Localization

Cited by: 22
Authors
Babar, Sadbhavana [1]
Das, Sukhendu [1]
Affiliations
[1] IIT Madras, Dept Comp Sci & Engn, Visualizat & Percept Lab, Chennai, Tamil Nadu, India
Source
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021) | 2021
DOI
10.1109/WACV48630.2021.00105
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Humans have an innate ability to recognize objects and their parts, and to confine their attention to the location in a visual scene where the object is spatially present. Recently, efforts to train machines to mimic this ability through weakly supervised object localization, using only image-level training labels, have garnered considerable attention. Nonetheless, a well-known problem with most existing methods is that they localize only the most discriminative part of an object and give little or no focus to its other pertinent parts. In this paper, we propose a novel way of localizing objects, trained with image-level labels only, by mining information from complementary regions in an image. First, we adopt regional dropout at complementary spatial locations to create two intermediate images. With the help of a novel Channel-wise Assisted Attention Module (CAAM) coupled with a Spatial Self-Attention Module (SSAM), we train our model in parallel to leverage the information from complementary image regions for localization. Finally, we fuse the attention maps generated by the two classifiers using our Attention-based Fusion Loss. Several experimental studies demonstrate the superior performance of the proposed approach. Our method shows a significant increase in localization performance over existing state-of-the-art methods on the CUB-200-2011 and ILSVRC 2016 datasets.
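The first step in the abstract, regional dropout at complementary spatial locations, can be illustrated with a small sketch. The abstract does not say how the dropped regions are chosen, so the patch grid, the `grid` and `drop_prob` parameters, and the function name `complementary_dropout` below are all assumptions made for illustration, not the authors' implementation. The sketch only shows the general idea: every pixel hidden in one intermediate image stays visible in the other.

```python
import numpy as np


def complementary_dropout(image, grid=4, drop_prob=0.5, seed=None):
    """Split `image` into two views whose dropped patches are complementary.

    Hypothetical scheme: the image is divided into a grid x grid layout of
    patches; each patch is kept in exactly one of the two views.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]

    # One keep/drop decision per grid cell (assumed, not from the paper).
    cell_keep = rng.random((grid, grid)) >= drop_prob

    # Map each pixel row/column to its grid cell, then expand to a pixel mask.
    rows = np.arange(h) * grid // h
    cols = np.arange(w) * grid // w
    mask = cell_keep[rows[:, None], cols[None, :]]  # shape (h, w), bool

    # Add a channel axis so the mask broadcasts over colour images.
    pixel_mask = mask[:, :, None] if image.ndim == 3 else mask

    view_a = image * pixel_mask.astype(image.dtype)      # patches kept in A
    view_b = image * (~pixel_mask).astype(image.dtype)   # the remaining patches
    return view_a, view_b, mask


if __name__ == "__main__":
    img = np.random.default_rng(0).random((224, 224, 3))
    a, b, m = complementary_dropout(img, grid=4, seed=1)
    # Together the two views cover the original image exactly once.
    print(np.allclose(a + b, img))
```

In a pipeline like the one the abstract outlines, each view would feed one of the two classifiers, and the attention maps from both would then be fused; those later stages (CAAM, SSAM, and the Attention-based Fusion Loss) are not sketched here because the abstract gives no detail about them.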
Pages: 1009-1018
Page count: 10