Combining Pixel-Level and Structure-Level Adaptation for Semantic Segmentation

Citations: 3

Authors:

Bi, Xiwen [1]

Chen, Dubing [1]

Huang, He [2]

Wang, Shidong [3]

Zhang, Haofeng [1]

Affiliations:

[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China

[2] Nanjing Res Inst Elect Engn, Dept Data Link & Commun, Nanjing 210007, Peoples R China

[3] Newcastle Univ, Sch Engn, Newcastle Upon Tyne NE17RU, England

Source:

NEURAL PROCESSING LETTERS | 2023 / Vol. 55 / Issue 07

Funding:

National Natural Science Foundation of China;

Keywords:

Domain adaptation; Semantic segmentation; Category prototype; Consistency regularization;

DOI:

10.1007/s11063-023-11220-5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Domain adaptation for semantic segmentation requires pixel-level knowledge transfer from a labeled source domain to an unlabeled target domain. Existing approaches typically align the features of the source and target domains at different levels. However, they usually neglect the different adaptive complexities of different information flows within images. In this paper, we focus on combining two main information flows in semantic segmentation, i.e., the pixel-level disparate information and image structure information. Specifically, we propose to combine two feature map-based prediction heads, which are thought to focus on pixel-level and structure-level information, to accommodate different complexities by adjusting the attention to adaptation functions of the target domain. We then align the outputs from the two heads through a consistency regularization to realize informative complementarity. The combined prediction head further enables regularizing the distance between pixel representations of different classes, thereby mitigating the mis-adaptation problem of similar classes. The proposed method achieves more competitive results than current state-of-the-art methods on two publicly available benchmark settings, i.e., SYNTHIA -> Cityscapes and GTA5 -> Cityscapes.
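
The consistency regularization between the two prediction heads can be illustrated with a minimal sketch. The abstract does not state the exact form of the regularizer, so the mean squared difference between per-pixel class probabilities used here is an assumption, and the names `softmax`, `consistency_loss`, `pixel_head_logits`, and `structure_head_logits` are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Convert one pixel's class logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(pixel_head_logits, structure_head_logits):
    """Mean squared difference between the two heads' class probabilities.

    Each argument is a list of pixels, and each pixel is a list of class logits.
    ASSUMPTION: squared error is an illustrative choice; the paper's actual
    consistency term may differ.
    """
    if not pixel_head_logits or len(pixel_head_logits) != len(structure_head_logits):
        raise ValueError("both heads must predict the same non-empty set of pixels")
    total, count = 0.0, 0
    for logits_a, logits_b in zip(pixel_head_logits, structure_head_logits):
        probs_a, probs_b = softmax(logits_a), softmax(logits_b)
        total += sum((p - q) ** 2 for p, q in zip(probs_a, probs_b))
        count += len(probs_a)
    return total / count


# Two pixels, three classes: the heads agree on the first pixel only.
head_a = [[2.0, 0.5, -1.0], [3.0, 0.0, 0.0]]
head_b = [[2.0, 0.5, -1.0], [0.0, 3.0, 0.0]]
print(consistency_loss(head_a, head_b))
```

The loss is zero when both heads produce identical predictions and grows as they disagree, so minimizing it on unlabeled target images pushes the two heads toward agreement.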


Pages: 9669-9684

Page count: 16
