Combining Pixel-Level and Structure-Level Adaptation for Semantic Segmentation

Cited by: 3
Authors
Bi, Xiwen [1]
Chen, Dubing [1]
Huang, He [2]
Wang, Shidong [3]
Zhang, Haofeng [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Nanjing Res Inst Elect Engn, Dept Data Link & Commun, Nanjing 210007, Peoples R China
[3] Newcastle Univ, Sch Engn, Newcastle Upon Tyne NE1 7RU, England
Funding
National Natural Science Foundation of China
Keywords
Domain adaptation; Semantic segmentation; Category prototype; Consistency regularization
DOI
10.1007/s11063-023-11220-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Domain adaptation for semantic segmentation requires pixel-level knowledge transfer from a labeled source domain to an unlabeled target domain. Existing approaches typically align the features of the source and target domains at different levels. However, they usually neglect the fact that different information flows within an image differ in how hard they are to adapt. In this paper, we focus on combining two main information flows in semantic segmentation, i.e., pixel-level disparate information and image structure information. Specifically, we propose to combine two feature-map-based prediction heads, which are thought to focus on pixel-level and structure-level information respectively, to accommodate these different complexities by adjusting the attention paid to the adaptation functions of the target domain. We then align the outputs of the two heads through consistency regularization so that their information becomes complementary. The combined prediction head further enables regularizing the distance between pixel representations of different classes, thereby mitigating the mis-adaptation of similar classes. The proposed method achieves results that are competitive with or better than current state-of-the-art methods on two publicly available benchmarks, i.e., SYNTHIA -> Cityscapes and GTA5 -> Cityscapes.
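The abstract does not state which consistency term is used to align the two heads. As a minimal sketch only, the idea of penalizing disagreement between two per-pixel class distributions can be illustrated with a symmetric KL divergence. The choice of symmetric KL, the function names (`softmax`, `symmetric_kl`, `consistency_loss`), and the list-of-lists input layout are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def symmetric_kl(p, q, eps=1e-12):
    """Average of KL(p||q) and KL(q||p); an assumed choice of divergence."""
    kl_pq = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return 0.5 * (kl_pq + kl_qp)


def consistency_loss(pixel_head_logits, structure_head_logits):
    """Mean per-pixel divergence between the two heads' predictions.

    Each argument is a list with one entry per pixel; each entry is a
    list of class logits (hypothetical layout for this sketch).
    """
    losses = [
        symmetric_kl(softmax(a), softmax(b))
        for a, b in zip(pixel_head_logits, structure_head_logits)
    ]
    return sum(losses) / len(losses)


# Two pixels, three classes: the heads agree on the first pixel only.
head_a = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
head_b = [[2.0, 0.5, -1.0], [3.0, 0.2, 0.1]]
loss = consistency_loss(head_a, head_b)
```

In this sketch the loss is zero when both heads output identical distributions for every pixel and grows as their predictions diverge, which is the behavior a consistency regularizer on unlabeled target images relies on.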
Pages: 9669-9684
Page count: 16