Multi-Perspective Pseudo-Label Generation and Confidence-Weighted Training for Semi-Supervised Semantic Segmentation

Times Cited: 0
Authors
Hu, Kai [1 ]
Chen, Xiaobo [1 ]
Chen, Zhineng [2 ]
Zhang, Yuan [1 ]
Gao, Xieping [3 ]
Affiliations
[1] Xiangtan Univ, Minist Educ, Key Lab Intelligent Comp & Informat Proc, Xiangtan 411105, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai 200438, Peoples R China
[3] Hunan Normal Univ, Hunan Prov Key Lab Intelligent Comp & Language Inf, Changsha 410081, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Semantic segmentation; Predictive models; Data models; Perturbation methods; Semantics; Semisupervised learning; Data augmentation; Supervised learning; Stability analysis; Self-training; semi-supervised learning; semantic segmentation;
DOI
10.1109/TMM.2024.3521801
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Self-training has been shown to achieve remarkable gains in semi-supervised semantic segmentation by creating pseudo-labels from unlabeled data. However, this approach is limited by the quality of the generated pseudo-labels, and producing higher-quality pseudo-labels remains the main challenge to be addressed. In this paper, we propose a novel method for semi-supervised semantic segmentation based on Multi-perspective pseudo-label Generation and Confidence-weighted Training (MGCT). First, we present a multi-perspective pseudo-label generation strategy that considers both global and local semantic perspectives. This strategy ranks pixels across all images according to the global and local predictions, and then generates pseudo-labels for different pixels in stages according to the ranking results. Our pseudo-label generation method is better suited to semi-supervised semantic segmentation than existing approaches. Second, we propose a confidence-weighted training method to alleviate the performance degradation caused by unstable pixels. It assigns confidence weights to unstable pixels, which reduces their interference during training and enables more efficient training of the model. Finally, we validate our approach on the PASCAL VOC 2012 and Cityscapes datasets, and the results indicate that we achieve new state-of-the-art performance on both datasets in all settings.
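The abstract describes the pipeline only at a high level; the sketch below gives one plausible reading of it in PyTorch, assuming a teacher-student setup: per-pixel confidences from a global and a local prediction are fused to rank pixels, the top-ranked fraction receives hard pseudo-labels in the current stage, and the remaining unstable pixels are down-weighted rather than discarded. The function names, the fusion rule, and the keep-ratio schedule are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's code): staged pseudo-labelling from two
# prediction "perspectives" plus a confidence-weighted pixel-wise loss.
import torch
import torch.nn.functional as F

def pseudo_labels_and_weights(global_logits, local_logits, keep_ratio=0.5):
    """global_logits, local_logits: (B, C, H, W) teacher predictions of the
    same unlabeled region from a full-image and a local view (assumed to be
    spatially aligned and produced under torch.no_grad())."""
    p_global = F.softmax(global_logits, dim=1)
    p_local = F.softmax(local_logits, dim=1)
    conf_g, label_g = p_global.max(dim=1)      # (B, H, W) confidence + class
    conf_l, _ = p_local.max(dim=1)
    conf = 0.5 * (conf_g + conf_l)             # fuse the two perspectives

    # Rank all pixels by fused confidence; the top keep_ratio fraction gets a
    # hard pseudo-label at this stage, the rest are treated as "unstable".
    flat = conf.flatten()
    k = max(1, int(keep_ratio * flat.numel()))
    thresh = flat.topk(k).values.min()
    stable = conf >= thresh

    # Confidence-weighted training: stable pixels get weight 1, unstable
    # pixels are down-weighted by their fused confidence instead of dropped.
    weights = torch.where(stable, torch.ones_like(conf), conf)
    return label_g, weights

def weighted_ce(student_logits, pseudo_labels, weights):
    # Per-pixel cross-entropy scaled by the per-pixel confidence weights.
    loss = F.cross_entropy(student_logits, pseudo_labels, reduction="none")
    return (loss * weights).mean()
```

Raising keep_ratio over successive stages would mimic the staged generation described in the abstract, with early stages trusting only the most confidently ranked pixels.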
Pages: 300-311
Number of Pages: 12