Denoising Pretraining for Semantic Segmentation

Cited by: 76
Authors
Brempong, Emmanuel Asiedu [1 ,2 ]
Kornblith, Simon [1 ]
Chen, Ting [1 ]
Parmar, Niki [1 ]
Minderer, Matthias [1 ]
Norouzi, Mohammad [1 ]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
[2] Google AI Residency, Mountain View, CA USA
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022 | 2022
DOI
10.1109/CVPRW56347.2022.00462
CLC Classification Number
TP301 [Theory and Methods];
Discipline Classification Code
081202 ;
Abstract
Semantic segmentation labels are expensive and time-consuming to acquire. To improve the label efficiency of semantic segmentation models, we revisit denoising autoencoders and study the use of a denoising objective for pretraining UNets. We pretrain a Transformer-based UNet as a denoising autoencoder, followed by fine-tuning on semantic segmentation using few labeled examples. Denoising pretraining outperforms training from random initialization, and even supervised ImageNet-21K pretraining of the encoder when the number of labeled images is small. A key advantage of denoising pretraining over supervised pretraining of the backbone is the ability to pretrain the decoder, which would otherwise be randomly initialized. We thus propose a novel Decoder Denoising Pretraining (DDeP) method, in which we initialize the encoder using supervised learning and pretrain only the decoder using the denoising objective. Despite its simplicity, DDeP achieves state-of-the-art results on label-efficient semantic segmentation, offering considerable gains on the Cityscapes, Pascal Context, and ADE20K datasets.
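The denoising objective described in the abstract can be illustrated with a minimal sketch: corrupt a clean image with Gaussian noise and train the UNet to predict the added noise under an L2 loss. This is an illustrative simplification, not the paper's exact formulation; the function name, noise scale `sigma`, and the noise-prediction target are assumptions (the paper studies several target and scaling variants), and `model` stands in for any UNet-like callable.

```python
import numpy as np

def denoising_pretrain_step(model, x, sigma=0.2, rng=None):
    """One denoising-pretraining step (illustrative sketch).

    Corrupts the clean batch `x` with additive Gaussian noise and scores
    the model's prediction of that noise with an L2 loss. `model` is any
    callable mapping a noisy batch to a noise estimate of the same shape.
    """
    rng = np.random.default_rng(rng)
    noise = rng.normal(size=x.shape)      # epsilon ~ N(0, I)
    x_noisy = x + sigma * noise           # corrupted input fed to the UNet
    pred = model(x_noisy)                 # UNet forward pass (noise estimate)
    loss = np.mean((pred - noise) ** 2)   # L2 loss on the added noise
    return loss
```

After pretraining with a loss of this form, the (decoder) weights initialize the segmentation network, which is then fine-tuned on the few available labeled examples.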
Pages: 4174-4185
Page count: 12