LayerCAM: Exploring Hierarchical Class Activation Maps for Localization

Cited by: 399
Authors
Jiang, Peng-Tao [1 ]
Zhang, Chang-Bin [1 ]
Hou, Qibin [2 ]
Cheng, Ming-Ming [1 ]
Wei, Yunchao [3 ]
Affiliations
[1] Nankai Univ, TKLNDST, CS, Tianjin 300071, Peoples R China
[2] NUS, Dept Elect & Comp Engn, Singapore 119077, Singapore
[3] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China
Keywords
Location awareness; task analysis; semantics; image segmentation; reliability; convolution; spatial resolution; weakly-supervised object localization; class activation maps; supervised object localization; defect detection; segmentation; attention
DOI
10.1109/TIP.2021.3089943
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Class activation maps are generated from the final convolutional layer of a CNN and highlight the discriminative object regions for a class of interest. These discovered object regions have been widely used in weakly-supervised tasks. However, due to the small spatial resolution of the final convolutional layer, such class activation maps often localize only coarse regions of the target objects, limiting the performance of weakly-supervised tasks that need pixel-accurate object locations. We therefore aim to extract finer-grained object localization information from class activation maps so that target objects can be located more accurately. In this paper, by rethinking the relationships between the feature maps and their corresponding gradients, we propose a simple yet effective method, called LayerCAM. It produces reliable class activation maps for different layers of a CNN. This property enables us to collect object localization information from coarse (rough spatial localization) to fine (precise fine-grained details) levels. We further integrate these maps into a single high-quality class activation map in which object-related pixels are better highlighted. To evaluate the quality of the class activation maps produced by LayerCAM, we apply them to weakly-supervised object localization and semantic segmentation. Experiments demonstrate that the class activation maps generated by our method are more effective and reliable than those produced by existing attention methods. The code will be made publicly available.
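The abstract describes two steps: weighting each layer's feature maps by their gradients to obtain a per-layer activation map, and then fusing the coarse-to-fine maps from several layers into one. A minimal NumPy sketch of that idea is below; the function names (`layercam`, `fuse`), the positive-gradient weighting, the nearest-neighbor upsampling, and the max-fusion rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def layercam(features, gradients):
    """Per-layer class activation map: weight each feature-map location
    by its positive gradient, sum over channels, then ReLU-normalize.
    features, gradients: arrays of shape (C, H, W) for one image.
    (Sketch of the idea in the abstract; details are in the paper.)"""
    weights = np.maximum(gradients, 0)       # keep only positive gradients
    cam = (weights * features).sum(axis=0)   # element-wise weighting, sum over channels
    cam = np.maximum(cam, 0)                 # ReLU
    if cam.max() > 0:
        cam = cam / cam.max()                # normalize to [0, 1]
    return cam

def fuse(cams, size):
    """Merge per-layer maps: upsample each to a common resolution
    (nearest-neighbor via np.kron, assuming `size` is divisible by
    each map's side) and take the element-wise maximum."""
    ups = [np.kron(c, np.ones((size // c.shape[0], size // c.shape[1])))
           for c in cams]
    return np.max(np.stack(ups), axis=0)
```

In a real CNN the `features` and `gradients` would come from forward/backward hooks on the chosen layers; here they can be any arrays of matching shape, and the fused map combines the rough localization of deep layers with the finer detail of shallow ones.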
Pages: 5875-5888 (14 pages)