Contrastive Tokens and Label Activation for Remote Sensing Weakly Supervised Semantic Segmentation

Cited by: 2

Authors:

Hu, Zaiyi [1]

Gao, Junyu [1,2]

Yuan, Yuan [1]

Li, Xuelong [3]

Affiliations:

[1] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China

[2] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China

[3] China Telecom Corp Ltd, Inst Artificial Intelligence TeleAI, Beijing 100033, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62

Keywords:

Remote sensing; Semantic segmentation; Training; Task analysis; Semantics; Convolutional neural networks; Transformers; Deep learning; remote sensing images; vision transformer (ViT); weakly supervised semantic segmentation (WSSS);

DOI:

10.1109/TGRS.2024.3385747

Chinese Library Classification (CLC):

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification codes:

0708 ; 070902 ;

Abstract:

In recent years, there has been remarkable progress in weakly supervised semantic segmentation (WSSS), with vision transformer (ViT) architectures emerging as a natural fit for such tasks due to their inherent ability to leverage global attention for comprehensive object information perception. However, directly applying ViT to WSSS tasks can introduce challenges. The characteristics of ViT can lead to an oversmoothing problem, particularly in dense scenes of remote sensing images, significantly compromising the effectiveness of class activation maps (CAMs) and posing challenges for segmentation. Moreover, existing methods often adopt multistage strategies, adding complexity and reducing training efficiency. To overcome these challenges, a comprehensive framework, Contrastive Token and Foreground Activation (CTFA), based on the ViT architecture is presented for WSSS of remote sensing images. Our proposed method includes a contrastive token learning module (CTLM), incorporating both patch-wise and class-wise token learning to enhance model performance. In patch-wise learning, we leverage the semantic diversity preserved in the intermediate layers of ViT: we derive a relation matrix from these layers and employ it to supervise the final output tokens, thereby improving the quality of the CAM. In class-wise learning, we ensure the consistency of representation between global and local tokens, revealing more complete object regions. Additionally, by activating foreground features in the generated pseudo label using a dual-branch decoder, we further improve CAM generation. Our approach demonstrates outstanding results across three well-established datasets, providing a more efficient and streamlined solution for WSSS. Code will be available at: https://github.com/ZaiyiHu/CTFA.
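
The patch-wise idea in the abstract (a relation matrix derived from intermediate-layer tokens that supervises the final output tokens) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which is in the repository linked above. The function names, the cosine-similarity measure, the `threshold` value, and the cross-entropy form of the loss are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(tokens):
    """Pairwise cosine similarity between the rows of an (N, D) token array."""
    norms = np.linalg.norm(tokens, axis=1, keepdims=True)
    unit = tokens / np.clip(norms, 1e-8, None)
    return unit @ unit.T


def relation_matrix(intermediate_tokens, threshold=0.5):
    """Binary (N, N) relation matrix from intermediate-layer patch tokens.

    Entry (i, j) is 1 when patches i and j are similar enough.
    `threshold` is a hypothetical hyperparameter, not taken from the paper.
    """
    sim = cosine_similarity_matrix(intermediate_tokens)
    return (sim > threshold).astype(np.float64)


def patch_contrast_loss(final_tokens, relation):
    """Assumed loss: related final tokens should be similar, unrelated ones not.

    Cosine similarity is mapped from [-1, 1] to [0, 1] and compared with the
    relation matrix by binary cross-entropy over off-diagonal pairs.
    """
    sim = cosine_similarity_matrix(final_tokens)
    prob = np.clip((sim + 1.0) / 2.0, 1e-6, 1.0 - 1e-6)
    bce = -(relation * np.log(prob) + (1.0 - relation) * np.log(1.0 - prob))
    off_diagonal = ~np.eye(len(final_tokens), dtype=bool)
    return float(bce[off_diagonal].mean())


# Toy data: four patch tokens forming two semantic groups.
intermediate = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
relation = relation_matrix(intermediate)

diverse_final = intermediate.copy()   # final tokens keep the two groups
oversmoothed_final = np.ones((4, 2))  # final tokens collapsed to one vector

print(patch_contrast_loss(diverse_final, relation))
print(patch_contrast_loss(oversmoothed_final, relation))
```

In this toy setting the collapsed (oversmoothed) tokens receive a much larger loss than the tokens that preserve the two groups, which is the kind of pressure the abstract describes for improving CAM quality.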


Pages: 1 / 11

Page count: 11
