TSD-CAM: transformer-based self distillation with CAM similarity for weakly supervised semantic segmentation

Cited by: 0
Authors
Yan, Lingyu [1]
Chen, Jiangfeng [1]
Tang, Yuanyan [2]
Affiliations
[1] Hubei Univ Technol, Sch Comp Sci, Wuhan, Peoples R China
[2] Zhuhai UM Sci & Technol Res Inst, Zhuhai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
weakly supervised semantic segmentation; class activation map; transformer; similarity; self distillation;
DOI
10.1117/1.JEI.33.2.023029
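Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract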
Weakly supervised semantic segmentation (WSSS) using only image-level labels is a challenging task. Most existing methods use class activation maps (CAMs) to generate pixel-level pseudo labels for supervised training. However, the gap between classification and segmentation prevents the network from capturing comprehensive semantic information and from generating accurate pseudo masks for segmentation. To address this issue, we propose TSD-CAM, a transformer-based self distillation (SD) method that utilizes CAM similarity. TSD-CAM uses the similarity between CAMs generated from different views as a distillation target, which provides additional supervision for the network and narrows the gap between classification and segmentation. SD supervision allows the network to acquire more semantic information and refine CAMs to generate higher-precision pseudo labels. In addition, we propose the adaptive pixel refinement module, which adaptively refines and adjusts images based on pixel variations, further improving the precision of pseudo labels. Our method is a fully end-to-end single-stage approach that achieves state-of-the-art results of 71.3% mIoU on PASCAL VOC 2012 and 42.9% mIoU on MS COCO 2014. TSD-CAM significantly outperforms other single-stage competitors and achieves performance comparable to state-of-the-art multi-stage methods. Extensive ablation experiments demonstrate the effectiveness of our method, which offers a new perspective on the problems of WSSS. Our code is available at: https://github.com/pipizhum/TSD-CAM.
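The abstract describes using the similarity between CAMs from different views as a distillation target. The sketch below is only an illustration of that general idea, not the authors' implementation: it assumes each CAM is a `(classes, height, width)` array, builds a pixel-to-pixel cosine similarity matrix for each view, and takes the mean absolute difference between the two matrices as the loss. The function names, the choice of cosine similarity, and the L1 penalty are all assumptions made for this example; the actual loss is defined in the paper and the linked repository.

```python
import numpy as np


def pairwise_cosine_similarity(cam):
    """Return the (H*W, H*W) cosine similarity between per-pixel class-score vectors.

    cam: array of shape (C, H, W). Hypothetical helper for this sketch.
    """
    num_classes = cam.shape[0]
    feats = cam.reshape(num_classes, -1).T               # (H*W, C)
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    feats = feats / np.maximum(norms, 1e-8)              # avoid division by zero
    return feats @ feats.T


def cam_similarity_loss(cam_student, cam_teacher):
    """Mean absolute difference between the two views' similarity matrices.

    The teacher matrix plays the role of the distillation target (assumed
    fixed here; a real training loop would stop gradients through it).
    """
    sim_student = pairwise_cosine_similarity(cam_student)
    sim_teacher = pairwise_cosine_similarity(cam_teacher)
    return float(np.mean(np.abs(sim_student - sim_teacher)))


# Toy data: one CAM and a slightly perturbed copy standing in for a second view.
rng = np.random.default_rng(0)
cam_a = rng.random((3, 4, 4))
cam_b = np.clip(cam_a + 0.05 * rng.standard_normal(cam_a.shape), 0.0, None)

loss_same = cam_similarity_loss(cam_a, cam_a)
loss_diff = cam_similarity_loss(cam_a, cam_b)
print(loss_same, loss_diff)
```

Because the comparison is made on similarity matrices rather than on raw activations, the loss in this sketch is insensitive to a uniform rescaling of either CAM; it penalizes only changes in how pixels relate to one another.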
Pages: 20