Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation

Cited by: 12
Authors
Hoyer, Lukas [1]
Dai, Dengxin [2]
Van Gool, Luc [1, 3, 4]
Affiliations
[1] Swiss Fed Inst Technol, CH-8092 Zurich, Switzerland
[2] Huawei Zurich Res Ctr, CH-8050 Zurich, Switzerland
[3] Katholieke Univ Leuven, B-3000 Leuven, Belgium
[4] INSAIT, Sofia 1784, Bulgaria
Keywords
Domain adaptation; domain generalization; semantic segmentation; transformers; high-resolution; multi-resolution
DOI
10.1109/TPAMI.2023.3320613
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unsupervised domain adaptation (UDA) and domain generalization (DG) enable machine learning models trained on a source domain to perform well on unlabeled or even unseen target domains. As previous UDA and DG semantic segmentation methods are mostly based on outdated networks, we benchmark more recent architectures, reveal the potential of Transformers, and design the DAFormer network tailored for UDA and DG. It is enabled by three training strategies that avoid overfitting to the source domain: (1) Rare Class Sampling mitigates the bias toward common source domain classes, while (2) a Thing-Class ImageNet Feature Distance and (3) a learning rate warmup promote feature transfer from ImageNet pretraining. As UDA and DG are usually GPU memory intensive, most previous methods downscale or crop images. However, low-resolution predictions often fail to preserve fine details, while models trained with cropped images fall short in capturing long-range, domain-robust context information. Therefore, we propose HRDA, a multi-resolution framework for UDA and DG that uses a learned scale attention to combine the strengths of small high-resolution crops, which preserve fine segmentation details, and large low-resolution crops, which capture long-range context dependencies. DAFormer and HRDA significantly improve the state of the art in UDA and DG by more than 10 mIoU on 5 different benchmarks.
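The abstract names Rare Class Sampling but gives no formula for it. As a minimal sketch of the idea only, the snippet below assumes that a class is drawn with a probability given by a temperature softmax over (1 - class frequency), so that rarer source classes are sampled more often. The function names, the temperature value, and the class frequencies are all hypothetical and are not taken from this record.

```python
import math
import random


def rare_class_sampling_probs(class_freq, temperature=0.1):
    """Assumed rule: softmax over (1 - frequency) / temperature.

    class_freq maps a class name to its share of source-domain pixels.
    A smaller temperature concentrates more probability on rare classes.
    """
    logits = {c: (1.0 - f) / temperature for c, f in class_freq.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {c: math.exp(v - top) for c, v in logits.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def sample_class(probs, rng=random):
    """Draw one class name according to the given probabilities."""
    classes = list(probs)
    weights = [probs[c] for c in classes]
    return rng.choices(classes, weights=weights, k=1)[0]


# Hypothetical pixel-frequency shares, for illustration only.
freq = {"road": 0.60, "car": 0.30, "bicycle": 0.08, "train": 0.02}
probs = rare_class_sampling_probs(freq, temperature=0.1)
chosen = sample_class(probs, random.Random(0))
```

In a training loop, the drawn class would then be used to pick a source image that contains it; that lookup is omitted here because the record does not describe it.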
Pages: 220-235
Page count: 16