Threshold-Adaptive Unsupervised Focal Loss for Domain Adaptation of Semantic Segmentation

Cited by: 15
Authors
Yan, Weihao [1]
Qian, Yeqiang [2]
Wang, Chunxiang [1]
Yang, Ming [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Key Lab Syst Control & Informat Proc, Minist Educ China, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Univ Michigan Shanghai Jiao Tong Univ Joint Inst, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Entropy; Semantics; Adaptation models; Real-time systems; Urban areas; Predictive models; Semantic segmentation; unsupervised domain adaptation; entropy minimization; focal loss;
DOI
10.1109/TITS.2022.3210759
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813;
Abstract
Semantic segmentation is an important task that lets intelligent vehicles understand their environment. Current deep-learning-based methods require large amounts of labeled data for training. Manual annotation is expensive, whereas simulators can provide accurate annotations at low cost. However, the performance of a semantic segmentation model trained on synthetic datasets degrades significantly in real scenes. Unsupervised domain adaptation (UDA) for semantic segmentation is used to reduce the domain gap and improve performance on the target domain. Existing adversarial and self-training methods usually involve complex training procedures, while entropy-based methods have recently received attention for their simplicity and effectiveness. However, entropy-based UDA methods suffer from two problems: they barely optimize hard samples, and they lack an explicit semantic connection between the source and target domains. In this paper, we propose a novel two-stage entropy-based UDA method for semantic segmentation. In stage one, we design a threshold-adaptive unsupervised focal loss to regularize the predictions in the target domain. This is the first time an unsupervised focal loss has been introduced into UDA for semantic segmentation; it helps to optimize hard samples and avoids generating unreliable pseudo-labels in the target domain. In stage two, we employ cross-domain image mixing (CIM) to bridge the semantic knowledge between the two domains and incorporate long-tail class pasting to alleviate the class imbalance problem. Extensive experiments on synthetic-to-real and cross-city benchmarks demonstrate the effectiveness of our method. It achieves state-of-the-art performance using DeepLabV2, as well as competitive performance using the lightweight BiSeNet, with large advantages in training and inference time.
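The abstract's remark that entropy-based methods barely optimize hard samples can be illustrated with a small sketch. It is a minimal illustration under stated assumptions, not the formulation from the paper. The function names, the choice gamma = 2 and the focal-style form `-(1 - p_max)^gamma * log(p_max)` are all hypothetical, and the adaptive threshold is not modeled. For a binary prediction, the derivative of the entropy with respect to the probability is `log((1 - p) / p)`, which vanishes near p = 0.5 (uncertain, hard pixels) and grows as p approaches 1 (confident, easy pixels).

```python
import math

def pixel_entropy(probs):
    """Shannon entropy (natural log) of one pixel's class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_loss(prob_map):
    """Plain entropy minimization: mean per-pixel entropy."""
    return sum(pixel_entropy(p) for p in prob_map) / len(prob_map)

def binary_entropy_grad(p):
    """Derivative w.r.t. p of H(p) = -p*log(p) - (1-p)*log(1-p)."""
    return math.log((1.0 - p) / p)

def focal_style_loss(prob_map, gamma=2.0):
    """ASSUMED focal-style form (not the paper's formula):
    mean over pixels of -(1 - p_max)^gamma * log(p_max)."""
    total = 0.0
    for probs in prob_map:
        p_max = max(probs)  # model confidence for this pixel
        total -= (1.0 - p_max) ** gamma * math.log(p_max)
    return total / len(prob_map)

if __name__ == "__main__":
    easy = [0.99, 0.01]  # confident pixel
    hard = [0.55, 0.45]  # uncertain pixel
    # Entropy gradient magnitude is small for the hard pixel.
    print(abs(binary_entropy_grad(0.55)), abs(binary_entropy_grad(0.99)))
    # Per-pixel contribution under each loss.
    print(entropy_loss([easy]), entropy_loss([hard]))
    print(focal_style_loss([easy]), focal_style_loss([hard]))
```

Under this assumed form, the confident pixel contributes almost nothing to the focal-style loss, so the uncertain pixel accounts for a larger share of the total than it does under plain entropy minimization.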
Pages: 752-763
Page count: 12
Related papers
55 in total
[1]   Self-supervised Augmentation Consistency for Adapting Semantic Segmentation [J].
Araslanov, Nikita ;
Roth, Stefan .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :15379-15389
[2]  
Chapelle O., 2005, AISTATS, P57
[3]   Importance-Aware Semantic Segmentation for Autonomous Vehicles [J].
Chen, Bike ;
Gong, Chen ;
Yang, Jian .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2019, 20 (01) :137-148
[4]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848
[5]   Domain Adaptation for Semantic Segmentation with Maximum Squares Loss [J].
Chen, Minghao ;
Xue, Hongyang ;
Cai, Deng .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :2090-2099
[6]  
Chen Runnan, 2022, arXiv
[7]   No More Discrimination: Cross City Adaptation of Road Scene Segmenters [J].
Chen, Yi-Hsin ;
Chen, Wei-Yu ;
Chen, Yu-Ting ;
Tsai, Bo-Cheng ;
Wang, Yu-Chiang Frank ;
Sun, Min .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :2011-2020
[8]   CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency [J].
Chen, Yun-Chun ;
Lin, Yen-Yu ;
Yang, Ming-Hsuan ;
Huang, Jia-Bin .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :1791-1800
[9]   The Cityscapes Dataset for Semantic Urban Scene Understanding [J].
Cordts, Marius ;
Omran, Mohamed ;
Ramos, Sebastian ;
Rehfeld, Timo ;
Enzweiler, Markus ;
Benenson, Rodrigo ;
Franke, Uwe ;
Roth, Stefan ;
Schiele, Bernt .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3213-3223
[10]   Restricted Deformable Convolution-Based Road Scene Semantic Segmentation Using Surround View Cameras [J].
Deng, Liuyuan ;
Yang, Ming ;
Li, Hao ;
Li, Tianyi ;
Hu, Bing ;
Wang, Chunxiang .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, 21 (10) :4350-4362