Geometric Unsupervised Domain Adaptation for Semantic Segmentation

Cited by: 16
Authors
Guizilini, Vitor [1]
Li, Jie [1]
Ambrus, Rares [1]
Gaidon, Adrien [1]
Affiliations
[1] Toyota Research Institute (TRI), Los Altos, CA 94022 USA
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
DOI
10.1109/ICCV48922.2021.00842
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Simulators can efficiently generate large amounts of labeled synthetic data with perfect supervision for hard-to-label tasks like semantic segmentation. However, they introduce a domain gap that severely hurts real-world performance. We propose to use self-supervised monocular depth estimation as a proxy task to bridge this gap and improve sim-to-real unsupervised domain adaptation (UDA). Our Geometric Unsupervised Domain Adaptation method (GUDA) learns a domain-invariant representation via a multi-task objective combining synthetic semantic supervision with real-world geometric constraints on videos. GUDA establishes a new state of the art in UDA for semantic segmentation on three benchmarks, outperforming methods that use domain adversarial learning, self-training, or other self-supervised proxy tasks. Furthermore, we show that our method scales well with the quality and quantity of synthetic data while also improving depth prediction.
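The multi-task objective mentioned in the abstract can be illustrated with a toy sketch. This record does not give the paper's actual loss terms, so everything below is an assumption made for illustration: the function names (`cross_entropy`, `photometric_l1`, `multitask_loss`), the `weight` parameter, and the choice of a plain L1 photometric error as the geometric term are hypothetical. The sketch only shows the general pattern of adding a supervised loss on labeled synthetic pixels to a self-supervised loss on unlabeled real video frames.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true class over labeled (synthetic) pixels."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def photometric_l1(target, warped):
    """Mean absolute intensity difference between a real frame and a
    neighbouring frame already warped into its view.
    The depth- and pose-based warping itself is assumed and not shown."""
    return sum(abs(t - w) for t, w in zip(target, warped)) / len(target)


def multitask_loss(syn_probs, syn_labels, real_target, real_warped, weight=1.0):
    """Hypothetical combined objective: supervised semantic term on synthetic
    data plus a weighted self-supervised geometric term on real data."""
    semantic = cross_entropy(syn_probs, syn_labels)
    geometric = photometric_l1(real_target, real_warped)
    return semantic + weight * geometric


# Toy usage: two synthetic pixels with two classes, two real pixel intensities.
loss = multitask_loss(
    syn_probs=[[0.5, 0.5], [0.25, 0.75]],
    syn_labels=[0, 1],
    real_target=[0.2, 0.4],
    real_warped=[0.1, 0.6],
    weight=2.0,
)
```

In this pattern no real-world labels are needed: the second term depends only on real frames, which is what would make such an objective usable for unsupervised adaptation.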
Pages: 8517-8527
Number of pages: 11