SSR: SAM is a Strong Regularizer for domain adaptive semantic segmentation

Cited by: 0

Authors:

Ge, Yanqi [1]

Huang, Ye [1]

Li, Wen [1]

Duan, Lixin [1]

Affiliations:

[1] Univ Elect Sci & Technol China, Shenzhen Inst Adv Study, Shenzhen, Peoples R China

Source:

2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024 | 2024

Keywords:

semantic segmentation; domain adaption;

DOI:

10.1109/CAI59869.2024.00236

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We introduce SSR, which uses SAM (Segment Anything) as a strong regularizer during training to make the image encoder more robust across domains. Because SAM is pre-trained on a massive dataset covering a wide variety of domains, the features it extracts are less dependent on any specific domain than those of a conventional ImageNet pre-trained image encoder. At the same time, an ImageNet pre-trained image encoder remains a mature choice of backbone for semantic segmentation, especially since SAM is category-agnostic. SSR therefore adopts a simple yet effective design: it uses the ImageNet pre-trained image encoder as the backbone, and the intermediate feature of each stage (e.g., the 4 stages of MiT-B5) is regularized by SAM during training. Extensive experiments show that SSR significantly improves performance over the baseline without introducing any extra inference overhead.
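The abstract does not state the exact form of the regularizer, so the following is only a minimal sketch of one plausible reading: each backbone stage feature is projected to the SAM feature width and pulled toward a spatially resized SAM feature with a mean-squared-error penalty. The MSE loss, the 1x1 projection, the nearest-neighbour resize, and every name here (`resize_nearest`, `stage_regularizer`, `ssr_style_loss`, `weight`) are assumptions for illustration, not the authors' implementation. NumPy stands in for a deep learning framework to keep the example small.

```python
import numpy as np


def resize_nearest(feat, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) feature map to (C, out_h, out_w)."""
    _, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return feat[:, rows[:, None], cols[None, :]]


def stage_regularizer(stage_feat, sam_feat, proj):
    """Assumed per-stage penalty: MSE between the projected backbone
    feature and the SAM feature resized to the same spatial size."""
    c, h, w = stage_feat.shape
    # 1x1 projection over channels: (C_sam, C_stage) @ (C_stage, H*W)
    projected = (proj @ stage_feat.reshape(c, h * w)).reshape(-1, h, w)
    target = resize_nearest(sam_feat, h, w)
    return float(np.mean((projected - target) ** 2))


def ssr_style_loss(stage_feats, sam_feat, projs, weight=1.0):
    """Sum the assumed per-stage penalties over all backbone stages."""
    total = sum(stage_regularizer(f, sam_feat, p)
                for f, p in zip(stage_feats, projs))
    return weight * total


# Toy example with four stages; all shapes are made up for illustration.
rng = np.random.default_rng(0)
sam_feat = rng.normal(size=(6, 8, 8))            # stand-in for a frozen SAM feature
stage_shapes = [(4, 16, 16), (8, 8, 8), (16, 4, 4), (32, 2, 2)]
stage_feats = [rng.normal(size=s) for s in stage_shapes]
projs = [rng.normal(size=(6, s[0])) for s in stage_shapes]
reg_loss = ssr_style_loss(stage_feats, sam_feat, projs, weight=0.1)
```

In a training loop this term would be added to the segmentation loss. Because SAM and the projections are used only to compute the penalty, they can be dropped at test time, which matches the abstract's claim of no extra inference overhead.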


Pages: 1332-1333

Page count: 2

6 entries in total

[1] The Cityscapes Dataset for Semantic Urban Scene Understanding
Cordts, Marius; Omran, Mohamed; Ramos, Sebastian; Rehfeld, Timo; Enzweiler, Markus; Benenson, Rodrigo; Franke, Uwe; Roth, Stefan; Schiele, Bernt
[J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 3213-3223
[2] MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation
Hoyer, Lukas; Dai, Dengxin; Wang, Haoran; Van Gool, Luc
[J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 11721-11732
[3] DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation
Hoyer, Lukas; Dai, Dengxin; Van Gool, Luc
[J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 9914-9925
[4] Kirillov A., 2023, ICCV
[5] Playing for Data: Ground Truth from Computer Games
Richter, Stephan R.; Vineet, Vibhav; Roth, Stefan; Koltun, Vladlen
[J]. COMPUTER VISION - ECCV 2016, PT II, 2016, 9906: 102-118
[6] Xie EZ, 2021, ADV NEUR IN, V34
