Unsupervised cross domain semantic segmentation with mutual refinement and information distillation

Cited by: 2
Authors
Ren, Dexin [1]
Wang, Shidong [2]
Zhang, Zheng [3]
Yang, Wankou [4]
Ren, Mingwu [1]
Zhang, Haofeng [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Newcastle Univ, Sch Engn, Newcastle Upon Tyne NE17RU, England
[3] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China
[4] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Unsupervised domain adaptation; Semantic segmentation; Mutual refinement; Information distillation; Curriculum learning; NETWORK;
DOI
10.1016/j.neucom.2024.127641
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unsupervised cross-domain semantic segmentation has recently gained much attention because it can solve the segmentation problem on unlabeled domains. Traditional methods often employ an adversarial network to confuse the source and target inputs so as to align them in a new feature space. However, these methods cannot fuse source and target domain information well, because the two domains are passed through separate network branches and their information does not really interact. To tackle this problem, we propose a truly interactive learning framework, named Mutual Refinement and Information Distillation (MURID), to align the two domains. Concretely, MURID introduces a Mutual Refinement module in the shallow network layers to enhance information sharing and integration between the source and target domains, which effectively transfers knowledge from the source domain to the target domain. In addition, to avoid using the same structure for testing as for training, which would incur a huge computational cost, we exploit an Information Distillation module to simplify the testing network while maintaining the powerful inference capability of the training network. Moreover, we incorporate Curriculum Learning, a self-training mechanism that iteratively trains the network using pseudo-labels obtained from the target domain, to further improve performance. Extensive experiments were conducted on three popular datasets in two adaptation settings, GTA5→Cityscapes and Synthia→Cityscapes, and the results demonstrate the state-of-the-art performance of our method. Detailed analysis and ablation studies are also carried out to validate the effectiveness of each designed module.
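The abstract describes self-training on pseudo-labels from the target domain but does not say how those labels are selected. The sketch below illustrates one common approach, confidence thresholding with a threshold that is relaxed over training rounds; it is not taken from the paper. The function names, the ignore value 255, and the linear threshold schedule are all assumptions made for illustration.

```python
import math

IGNORE = 255  # hypothetical label for pixels excluded from the loss


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_labels(pixel_logits, threshold):
    """Assign the argmax class where confidence >= threshold, else IGNORE.

    pixel_logits: list of per-pixel logit lists (one list per pixel).
    """
    labels = []
    for logits in pixel_logits:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best if probs[best] >= threshold else IGNORE)
    return labels


def curriculum_thresholds(start, end, rounds):
    """Assumed schedule: move the threshold linearly from start to end,
    so later rounds admit harder (less confident) pixels."""
    if rounds == 1:
        return [start]
    step = (end - start) / (rounds - 1)
    return [start + i * step for i in range(rounds)]


if __name__ == "__main__":
    # Two pixels, three classes: one confident prediction, one ambiguous.
    logits = [[5.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
    for t in curriculum_thresholds(0.9, 0.3, 3):
        print(round(t, 2), pseudo_labels(logits, t))
```

In an actual self-training loop, each round would retrain the segmentation network on the source labels plus the currently accepted target pseudo-labels; that step is omitted here because the abstract gives no details about it.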
Pages: 14