CPEWS: Contextual Prototype-Based End-to-End Weakly Supervised Semantic Segmentation

Cited by: 0
Authors
Shao, Xiaoyan [1]
Han, Jiaqi [1]
Li, Lingling [1]
Zhao, Xuezhuan [1,2,3,4]
Yan, Jingjing [1]
Affiliations
[1] Zhengzhou Univ Aeronaut, Sch Comp Sci, Zhengzhou 450046, Peoples R China
[2] Natl Key Lab Air Based Informat Percept & Fus, Luoyang 471000, Peoples R China
[3] Chongqing Res Inst, Harbin Inst Technol, Chongqing 401151, Peoples R China
[4] Aerosp Elect Informat Technol, Henan Collaborat Innovat Ctr, Zhengzhou 401151, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2025 / Vol. 83 / No. 1
Funding
National Natural Science Foundation of China
Keywords
End-to-end weakly supervised semantic segmentation; vision transformer; contextual prototype; class activation map;
DOI
10.32604/cmc.2025.060295
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The primary challenge in weakly supervised semantic segmentation is to leverage weak annotations effectively while minimizing the performance gap relative to fully supervised methods. End-to-end model designs have gained significant attention for improving training efficiency. Most current algorithms rely on Convolutional Neural Networks (CNNs) for feature extraction. Although CNNs are proficient at capturing local features, they often struggle with global context, leading to incomplete and false activations in Class Activation Maps (CAMs). To address these limitations, this work proposes a Contextual Prototype-Based End-to-End Weakly Supervised Semantic Segmentation (CPEWS) model, which improves feature extraction by using the Vision Transformer (ViT). Drawing on the ViT's intermediate feature layers to preserve semantic information, this work introduces the Intermediate Supervised Module (ISM) to supervise the final layer's output, reducing boundary ambiguity and mitigating incomplete activation. Additionally, the Contextual Prototype Module (CPM) generates class-specific prototypes, while the proposed Prototype Discrimination Loss (L_PDL) and Superclass Suppression Loss (L_SSL) guide the network's training, addressing false activation without extra supervision. The CPEWS model achieves state-of-the-art performance in end-to-end weakly supervised semantic segmentation without additional supervision. On the PASCAL VOC 2012 dataset, it reaches a mean Intersection over Union (mIoU) of 69.8% on the validation set and 72.6% on the test set. Compared with ToCo (with ImageNet-1k pretrained weights), the test-set mIoU is 2.1% higher. In addition, mIoU reaches 41.4% on the validation set of the MS COCO 2014 dataset.
Pages: 595-617
Page count: 23