Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation

Cited by: 12

Authors:

Hoyer, Lukas [1]

Dai, Dengxin [2]

Van Gool, Luc [1,3,4]

Affiliations:

[1] Swiss Fed Inst Technol, CH-8092 Zurich, Switzerland

[2] Huawei Zurich Res Ctr, CH-8050 Zurich, Switzerland

[3] Katholieke Univ Leuven, B-3000 Leuven, Belgium

[4] INSAIT, Sofia 1784, Bulgaria

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2024, Vol. 46, No. 1

Keywords:

Domain adaptation; domain generalization; semantic segmentation; transformers; high-resolution; multi-resolution

DOI:

10.1109/TPAMI.2023.3320613

Chinese Library Classification:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Unsupervised domain adaptation (UDA) and domain generalization (DG) enable machine learning models trained on a source domain to perform well on unlabeled or even unseen target domains. As previous UDA&DG semantic segmentation methods are mostly based on outdated networks, we benchmark more recent architectures, reveal the potential of Transformers, and design the DAFormer network tailored for UDA&DG. It is enabled by three training strategies to avoid overfitting to the source domain: While (1) Rare Class Sampling mitigates the bias toward common source domain classes, (2) a Thing-Class ImageNet Feature Distance and (3) a learning rate warmup promote feature transfer from ImageNet pretraining. As UDA&DG are usually GPU memory intensive, most previous methods downscale or crop images. However, low-resolution predictions often fail to preserve fine details while models trained with cropped images fall short in capturing long-range, domain-robust context information. Therefore, we propose HRDA, a multi-resolution framework for UDA&DG, that combines the strengths of small high-resolution crops to preserve fine segmentation details and large low-resolution crops to capture long-range context dependencies with a learned scale attention. DAFormer and HRDA significantly improve the state-of-the-art UDA&DG by more than 10 mIoU on 5 different benchmarks.
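
As a rough illustration of the multi-resolution fusion the abstract describes, the sketch below blends a context prediction (from a large low-resolution crop) with a detail prediction (from a small high-resolution crop) using a per-pixel attention weight. It is not the authors' implementation: the function names, the nested-list layout, the assumption that both predictions are already resampled to the same output grid, and the toy values are all hypothetical, and in the actual method the attention would be predicted by the network rather than given by hand.

```python
def fuse_with_scale_attention(context_probs, detail_probs, attention):
    """Blend two class-probability maps with a per-pixel weight.

    context_probs: H x W x C nested lists, prediction from the large
                   low-resolution crop (assumed already upsampled).
    detail_probs:  H x W x C nested lists, prediction from the small
                   high-resolution crop on the same grid (assumption).
    attention:     H x W weights in [0, 1]; 1 trusts the detail
                   prediction, 0 trusts the context prediction.
    """
    fused = []
    for ctx_row, det_row, att_row in zip(context_probs, detail_probs, attention):
        fused_row = []
        for ctx_px, det_px, a in zip(ctx_row, det_row, att_row):
            # Convex combination per class channel.
            fused_row.append([(1.0 - a) * c + a * d for c, d in zip(ctx_px, det_px)])
        fused.append(fused_row)
    return fused


def argmax_labels(probs):
    """Pick the most likely class index for every pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in probs]


# Toy 1 x 3 image with two classes (hypothetical values).
context = [[[0.8, 0.2], [0.6, 0.4], [0.6, 0.4]]]
detail = [[[0.3, 0.7], [0.2, 0.8], [0.1, 0.9]]]
attention = [[0.0, 0.5, 1.0]]

fused = fuse_with_scale_attention(context, detail, attention)
labels = argmax_labels(fused)
```

With an attention of 0 the first pixel keeps the context prediction, with 1 the last pixel takes the detail prediction, and the middle pixel averages the two.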


Pages: 220-235

Page count: 16

References: 84 in total

