Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection

被引:242
作者
Kim, Taekyung [1 ]
Jeong, Minki [1 ]
Kim, Seunghyeon [1 ]
Choi, Seokeon [1 ]
Kim, Changick [1 ]
机构
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
来源
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019年
关键词
D O I
10.1109/CVPR.2019.01274
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We introduce a novel unsupervised domain adaptation approach for object detection. We aim to alleviate the imperfect translation problem of pixel-level adaptations, and the source-biased discriminativity problem of feature-level adaptations simultaneously. Our approach is composed of two stages, i.e., Domain Diversification (DD) and Multi-domain-invariant Representation Learning (MRL). At the DD stage, we diversify the distribution of the labeled data by generating various distinctive shifted domains from the source domain. At the MRL stage, we apply adversarial learning with a multi-domain discriminator to encourage feature to be indistinguishable among the domains. DD addresses the source-biased discriminativity, while MRL mitigates the imperfect image translation. We construct a structured domain adaptation framework for our learning paradigm and introduce a practical way of DD for implementation. Our method outperforms the state-of-the-art methods by a large margin of 3%similar to 11% in terms of mean average precision (mAP) on various datasets.
引用
收藏
页码:12448 / 12457
页数:10
相关论文
共 48 条
[1]   Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks [J].
Bousmalis, Konstantinos ;
Silberman, Nathan ;
Dohan, David ;
Erhan, Dumitru ;
Krishnan, Dilip .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :95-104
[2]   AutoDIAL: Automatic DomaIn Alignment Layers [J].
Carlucci, Fabio Maria ;
Porzi, Lorenzo ;
Caputo, Barbara ;
Ricci, Elisa ;
Bulo, Samuel Rota .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :5077-5085
[3]  
Chen Y., 2017, CoRR
[4]   Less Is More: Picking Informative Frames for Video Captioning [J].
Chen, Yangyu ;
Wang, Shuhui ;
Zhang, Weigang ;
Huang, Qingming .
COMPUTER VISION - ECCV 2018, PT XIII, 2018, 11217 :367-384
[5]   Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning [J].
Chen, Yuhua ;
Pont-Tuset, Jordi ;
Montes, Alberto ;
Van Gool, Luc .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :1189-1198
[6]   The Cityscapes Dataset for Semantic Urban Scene Understanding [J].
Cordts, Marius ;
Omran, Mohamed ;
Ramos, Sebastian ;
Rehfeld, Timo ;
Enzweiler, Markus ;
Benenson, Rodrigo ;
Franke, Uwe ;
Roth, Stefan ;
Schiele, Bernt .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3213-3223
[7]  
Dai J., 2016, ADV NEURAL INFORM PR, P379, DOI DOI 10.1109/CVPR.2017.690
[8]   Histograms of oriented gradients for human detection [J].
Dalal, N ;
Triggs, B .
2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005, :886-893
[9]   The Pascal Visual Object Classes (VOC) Challenge [J].
Everingham, Mark ;
Van Gool, Luc ;
Williams, Christopher K. I. ;
Winn, John ;
Zisserman, Andrew .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2010, 88 (02) :303-338
[10]  
Ganin Y, 2015, PR MACH LEARN RES, V37, P1180