TSD-CAM: transformer-based self distillation with CAM similarity for weakly supervised semantic segmentation

Cited by: 0

Authors:

Yan, Lingyu [1]

Chen, Jiangfeng [1]

Tang, Yuanyan [2]

Affiliations:

[1] Hubei Univ Technol, Sch Comp Sci, Wuhan, Peoples R China

[2] Zhuhai UM Sci & Technol Res Inst, Zhuhai, Peoples R China

Source:

JOURNAL OF ELECTRONIC IMAGING | 2024 / Vol. 33 / No. 02

Funding:

National Natural Science Foundation of China;

Keywords:

weakly supervised semantic segmentation; class activation map; transformer; similarity; self distillation;

DOI:

10.1117/1.JEI.33.2.023029

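Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes: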

0808 ; 0809 ;

Abstract:

Weakly supervised semantic segmentation (WSSS) using only image-level labels is a challenging task. Most existing methods use class activation maps (CAMs) to generate pixel-level pseudo labels for supervised training. However, the gap between classification and segmentation hinders the network from obtaining comprehensive semantic information and generating accurate pseudo masks for segmentation. To address this issue, we propose TSD-CAM, a transformer-based self distillation (SD) method that utilizes CAM similarity. TSD-CAM uses the similarity between CAMs generated from different views as a distillation target, providing additional supervision for the network and narrowing the gap between classification and segmentation. SD supervision allows the network to acquire more semantic information and refine CAMs to generate higher-precision pseudo labels. In addition, we propose the adaptive pixel refinement module, which adaptively refines and adjusts images based on pixel variations, further improving the precision of pseudo labels. Our method is a fully end-to-end single-stage approach that achieves state-of-the-art results of 71.3% mIoU on PASCAL VOC 2012 and 42.9% mIoU on the MS COCO 2014 dataset. TSD-CAM significantly outperforms other single-stage competitors and achieves performance comparable to state-of-the-art multi-stage methods. Extensive ablation experiments demonstrate the effectiveness of our method, which offers a new perspective on the problems of WSSS. Our code is available at: https://github.com/pipizhum/TSD-CAM.
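The abstract's idea of using the similarity between CAMs from different views as a training signal can be illustrated with a minimal sketch. This is not the authors' implementation. The choice of a horizontal flip as the second view, cosine similarity as the similarity measure, and the names `hflip`, `cosine_similarity`, and `cam_similarity_loss` are all assumptions made for illustration; the record does not specify any of them.

```python
import math


def hflip(cam):
    """Horizontally flip a 2-D map given as a list of rows."""
    return [list(reversed(row)) for row in cam]


def cosine_similarity(cam_a, cam_b, eps=1e-8):
    """Cosine similarity between two same-sized 2-D maps, flattened."""
    a = [v for row in cam_a for v in row]
    b = [v for row in cam_b for v in row]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def cam_similarity_loss(cams_view1, cams_view2):
    """Hypothetical consistency loss: 1 - mean per-class cosine similarity.

    cams_view1: per-class CAMs computed from the original image.
    cams_view2: per-class CAMs computed from a horizontally flipped copy
                (assumed augmentation); each is flipped back so that both
                views are spatially aligned before comparison.
    """
    sims = [
        cosine_similarity(c1, hflip(c2))
        for c1, c2 in zip(cams_view1, cams_view2)
    ]
    return 1.0 - sum(sims) / len(sims)


# Toy single-class CAM on a 2x2 grid (made-up values).
cam = [[0.0, 0.2], [0.9, 0.1]]

# A network that responds consistently to the flipped image yields a
# flipped CAM, so the loss is close to zero.
loss_consistent = cam_similarity_loss([cam], [hflip(cam)])

# A network that ignores the flip yields misaligned CAMs and a larger loss.
loss_inconsistent = cam_similarity_loss([cam], [cam])
```

In this toy setting the loss is small when the two views agree after alignment and grows as they diverge, which is the kind of additional supervision the abstract describes. How the published method defines views, similarity, and the distillation target may differ.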


Pages: 20
