Multi-scale feature correspondence and pseudo label retraining strategy for weakly supervised semantic segmentation

Times cited: 0
Authors
Wang, Weizheng [1]
Zhou, Lei [1]
Wang, Haonan [1]
Affiliations
[1] Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha 410076, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Weakly supervised semantic segmentation; Vision transformer; Multi-scale feature correspondence; Pseudo label retraining strategy;
DOI
10.1016/j.imavis.2024.105215
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, the performance of semantic segmentation using weakly supervised learning has improved significantly. Weakly supervised semantic segmentation (WSSS) that uses only image-level labels has received widespread attention; it employs Class Activation Maps (CAM) to generate pseudo labels. Compared with the traditional use of pixel-level labels, this technique greatly reduces annotation costs by relying on simpler and more readily available image-level annotations. Because of the local perceptual ability of Convolutional Neural Networks (CNN), however, the generated CAM cannot activate the entire object area. Researchers have found that a Vision Transformer (ViT) can compensate for this CNN limitation, but ViT also introduces an over-smoothing problem. Recent research has made good progress on this issue, but discussions of CAM and the related segmentation predictions tend to overlook their intrinsic information and the interrelationships between them. In this paper, we propose a Multi-Scale Feature Correspondence (MSFC) method. MSFC obtains the feature correspondence of CAM and segmentation predictions at different scales and re-extracts useful semantic information from them, which enhances the network's learning of feature information and improves the quality of CAM. To further improve segmentation precision, we design a Pseudo Label Retraining Strategy (PLRS), which refines accuracy in local regions and raises the quality of the pseudo labels. Experimental results on the PASCAL VOC 2012 and MS COCO 2014 datasets show that our method achieves impressive performance among end-to-end WSSS methods.
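As background for the abstract's statement that CAM is used to generate pseudo labels, the following is a minimal sketch of the generic CAM-to-pseudo-label step only. It is not the paper's MSFC or PLRS. It assumes a CAM is the ReLU of a classifier-weighted sum of feature channels, normalized by its maximum, and that pixels are assigned by two thresholds. The function names, the thresholds `bg_low` and `fg_high`, and the `IGNORE` index are hypothetical choices made for illustration.

```python
from typing import Dict, List

Grid = List[List[float]]
IGNORE = 255  # hypothetical index for pixels too uncertain to label


def class_activation_map(features: List[Grid], weights: List[float]) -> Grid:
    """Weighted sum of feature channels, ReLU, then max-normalize to [0, 1]."""
    height, width = len(features[0]), len(features[0][0])
    cam = [
        [max(0.0, sum(w * f[y][x] for w, f in zip(weights, features)))
         for x in range(width)]
        for y in range(height)
    ]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


def pseudo_labels(cams: Dict[int, Grid],
                  bg_low: float = 0.3,
                  fg_high: float = 0.6) -> List[List[int]]:
    """Per pixel: strongest class if confident, 0 (background) if weak, else IGNORE.

    `cams` maps each class id present in the image-level label (ids >= 1)
    to its CAM. Both thresholds are assumed values, not taken from the paper.
    """
    first = next(iter(cams.values()))
    height, width = len(first), len(first[0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            cls, score = max(((c, cam[y][x]) for c, cam in cams.items()),
                             key=lambda pair: pair[1])
            if score >= fg_high:
                row.append(cls)
            elif score < bg_low:
                row.append(0)
            else:
                row.append(IGNORE)
        labels.append(row)
    return labels


# Toy input: two feature channels on a 2x2 grid, two image-level classes.
features = [
    [[1.0, 0.0], [0.5, 0.0]],
    [[0.0, 2.0], [0.0, 0.0]],
]
cams = {
    1: class_activation_map(features, [1.0, 0.0]),
    2: class_activation_map(features, [0.0, 1.0]),
}
print(pseudo_labels(cams))  # [[1, 2], [255, 0]]
```

The middle band between the two thresholds is left unlabeled so that a segmentation network trained on these labels is not penalized on ambiguous pixels; this is one common design choice, stated here as an assumption.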
Pages: 11
Related papers
53 in total
  • [1] Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation
    Ahn, Jiwoon
    Kwak, Suha
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 4981 - 4990
  • [2] Single-Stage Semantic Segmentation from Image Labels
    Araslanov, Nikita
    Roth, Stefan
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 4252 - 4261
  • [3] What's the Point: Semantic Segmentation with Point Supervision
    Bearman, Amy
    Russakovsky, Olga
    Ferrari, Vittorio
    Fei-Fei, Li
    [J]. COMPUTER VISION - ECCV 2016, PT VII, 2016, 9911 : 549 - 565
  • [4] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
  • [5] Chang YT, 2020, PROC CVPR IEEE, P8988, DOI 10.1109/CVPR42600.2020.00901
  • [6] Chen L., 2020, P EUR C COMP VIS, P347
  • [7] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
    Chen, Liang-Chieh
    Papandreou, George
    Kokkinos, Iasonas
    Murphy, Kevin
    Yuille, Alan L.
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) : 834 - 848
  • [8] Extracting Class Activation Maps from Non-Discriminative Features as well
    Chen, Zhaozheng
    Sun, Qianru
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 3135 - 3144
  • [9] Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation
    Chen, Zhaozheng
    Wang, Tan
    Wu, Xiongwei
    Hua, Xian-Sheng
    Zhang, Hanwang
    Sun, Qianru
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 959 - 968
  • [10] Deep Feature Factorization for Concept Discovery
    Collins, Edo
    Achanta, Radhakrishna
    Susstrunk, Sabine
    [J]. COMPUTER VISION - ECCV 2018, PT XIV, 2018, 11218 : 352 - 368