Per-class curriculum for Unsupervised Domain Adaptation in semantic segmentation

Cited by: 0
Authors
Alcover-Couso, Roberto [1 ]
Sanmiguel, Juan C. [1 ]
Escudero-Viñolo, Marcos [1]
Carballeira, Pablo [1 ]
Affiliations
[1] Universidad Autónoma de Madrid (UAM), Video Processing & Understanding Lab, Madrid 28049, Spain
Keywords
Semantic segmentation; Unsupervised domain adaptation; Curriculum learning; Synthetic data
DOI
10.1007/s00371-024-03373-8
Chinese Library Classification
TP31 [Computer Software]
Discipline Classification Codes
081202; 0835
Abstract
Accurate training of deep neural networks for semantic segmentation requires large numbers of pixel-level annotations of real images, which are expensive to generate or simply unavailable. In this context, Unsupervised Domain Adaptation (UDA) can transfer knowledge from unlimited synthetic annotations to unlabeled real images of a given domain. UDA methods comprise an initial training stage on labeled synthetic data, followed by a second stage that aligns features between the labeled synthetic and unlabeled real data. In this paper, we propose a novel approach for UDA that focuses on the initial training stage and leads to increased performance after adaptation. We introduce a curriculum strategy in which each semantic class is learned progressively, yielding better features for the second stage. This curriculum is based on: (1) a class-scoring function that determines the difficulty of each semantic class; (2) an incremental learning strategy, driven by the scoring and pacing functions, that, unlike standard curriculum-based training, limits the required training time; and (3) a training loss that operates at the class level. We extensively evaluate our approach as the first stage of several state-of-the-art UDA methods for semantic segmentation. Our results demonstrate significant performance enhancements across all methods: improvements of up to 10% for entropy-based techniques and 8% for adversarial methods. These findings underscore the dependency of UDA on the accuracy of the initial training. The implementation is available at https://github.com/vpulab/PCCL.
Pages: 901-919
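The abstract above names three components: a class-scoring function, a pacing-driven incremental learning strategy, and a class-level training loss. Below is a minimal PyTorch sketch of how such a per-class curriculum could be wired together; the loss-based difficulty score, the linear pacing schedule, and all function names are illustrative assumptions, not the paper's actual definitions (those are in the linked repository).

```python
import torch
import torch.nn.functional as F

def class_difficulty(per_class_loss: torch.Tensor) -> torch.Tensor:
    """Assumed scoring function: rank classes by their running
    training loss (higher loss = harder class)."""
    return per_class_loss

def pacing(epoch: int, warmup_epochs: int, num_classes: int) -> int:
    """Assumed linear pacing function: the number of active classes
    grows from 1 to num_classes over the warmup epochs."""
    frac = min(1.0, (epoch + 1) / max(1, warmup_epochs))
    return max(1, round(frac * num_classes))

def curriculum_ce_loss(logits: torch.Tensor, labels: torch.Tensor,
                       active_classes: list, ignore_index: int = 255) -> torch.Tensor:
    """Class-level loss: standard cross-entropy restricted to the
    classes currently admitted by the curriculum; pixels of all
    other classes are masked out via ignore_index."""
    masked = labels.clone()
    keep = torch.zeros_like(labels, dtype=torch.bool)
    for c in active_classes:
        keep |= labels == c
    masked[~keep] = ignore_index
    if not keep.any():  # no active-class pixels in this batch
        return logits.sum() * 0.0
    return F.cross_entropy(logits, masked, ignore_index=ignore_index)

if __name__ == "__main__":
    C, H, W = 19, 64, 64                     # e.g., the 19 Cityscapes classes
    logits = torch.randn(2, C, H, W)         # dummy network output
    labels = torch.randint(0, C, (2, H, W))  # dummy synthetic labels
    running_loss = torch.rand(C)             # per-class loss estimates
    order = torch.argsort(class_difficulty(running_loss))  # easiest first
    k = pacing(epoch=3, warmup_epochs=20, num_classes=C)
    loss = curriculum_ce_loss(logits, labels, order[:k].tolist())
    print(f"active classes: {k}, loss: {loss.item():.4f}")
```

Masking inactive classes to the ignore index keeps standard cross-entropy usable while restricting gradients to the classes the curriculum has admitted so far; the paper's actual class-level loss and pacing function may differ from this sketch.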