ADPL: Adaptive Dual Path Learning for Domain Adaptation of Semantic Segmentation

Cited by: 20
Authors
Cheng, Yiting [1]
Wei, Fangyun [2]
Bao, Jianmin [2]
Chen, Dong [2]
Zhang, Wenqiang [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200437, Peoples R China
[2] Microsoft Res Asia, Beijing 100080, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Domain adaptation; semantic segmentation; image translation; self-supervised learning
DOI
10.1109/TPAMI.2023.3248294
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To alleviate the need for large-scale pixel-wise annotations, domain adaptation for semantic segmentation trains segmentation models on synthetic data (source) with computer-generated annotations, which can then be generalized to segment realistic images (target). Recently, self-supervised learning (SSL) combined with image-to-image translation has shown great effectiveness in adaptive segmentation. The most common practice is to perform SSL along with image translation to align a single domain (source or target). However, in this single-domain paradigm, the unavoidable visual inconsistency introduced by image translation may affect subsequent learning. In addition, pseudo labels generated by a single segmentation model aligned in either the source or target domain may not be accurate enough for SSL. In this paper, based on the observation that domain adaptation frameworks performed in the source and target domains are almost complementary, we propose a novel adaptive dual path learning (ADPL) framework to alleviate visual inconsistency and promote pseudo-labeling by introducing two interactive single-domain adaptation paths aligned in the source and target domains respectively. To fully explore the potential of this dual-path design, novel techniques such as dual path image translation (DPIT), dual path adaptive segmentation (DPAS), dual path pseudo label generation (DPPLG) and Adaptive ClassMix are proposed. Inference with ADPL is extremely simple: only one segmentation model in the target domain is employed. Our ADPL outperforms the state-of-the-art methods by large margins on the GTA5 -> Cityscapes, SYNTHIA -> Cityscapes and GTA5 -> BDD100K scenarios. Code and models are available at https://github.com/royee182/DPL.
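The abstract does not specify how the two paths are combined when generating pseudo labels. As a rough illustration only, the sketch below assumes each path outputs per-pixel class probabilities, fuses them by a weighted average, and keeps a label only when the fused confidence is high. The function name `fuse_pseudo_labels`, the `weight` and `threshold` values, and the `IGNORE_LABEL` marker are all hypothetical and are not taken from the paper or its released code.

```python
from typing import List, Sequence

# Hypothetical marker for pixels left unlabeled; 255 is a common
# convention in segmentation datasets, assumed here for illustration.
IGNORE_LABEL = 255


def fuse_pseudo_labels(
    probs_source_path: Sequence[Sequence[float]],
    probs_target_path: Sequence[Sequence[float]],
    weight: float = 0.5,
    threshold: float = 0.9,
) -> List[int]:
    """Fuse per-pixel class probabilities from two paths into pseudo labels.

    Each input holds one probability vector per pixel. `weight` and
    `threshold` are illustrative values, not settings from the paper.
    """
    labels = []
    for p_src, p_tgt in zip(probs_source_path, probs_target_path):
        # Weighted average of the two paths' predictions for this pixel.
        fused = [weight * a + (1.0 - weight) * b for a, b in zip(p_src, p_tgt)]
        # Most likely class under the fused prediction.
        best_class = max(range(len(fused)), key=fused.__getitem__)
        # Keep the label only if the fused confidence is high enough.
        labels.append(best_class if fused[best_class] >= threshold else IGNORE_LABEL)
    return labels


# Three pixels, two classes: the paths agree on pixels 0 and 2 and disagree on pixel 1.
source_path = [[0.95, 0.05], [0.6, 0.4], [0.1, 0.9]]
target_path = [[0.97, 0.03], [0.3, 0.7], [0.02, 0.98]]
pseudo_labels = fuse_pseudo_labels(source_path, target_path)
```

In this toy input the pixel on which the two paths disagree falls below the confidence threshold and is left unlabeled, which is the intuition behind using complementary predictions to filter unreliable pseudo labels.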
Pages: 9339-9356
Page count: 18