Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains

Cited by: 29
Authors
Xu, Mingjun [1]
Qin, Lingyun [1]
Chen, Weijie [2]
Pu, Shiliang [2]
Zhang, Lei [1]
Affiliations
[1] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing, Peoples R China
[2] Hikvision Res Inst, Hangzhou, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
Funding
National Key R&D Program of China
DOI
10.1109/CVPR52729.2023.00783
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Domain shift degrades the performance of object detection models in practical applications. To alleviate the influence of domain shift, much previous work tries to decouple and learn the domain-invariant (common) features from source domains via domain adversarial learning (DAL). However, inspired by causal mechanisms, we find that previous methods ignore the implicit, insignificant non-causal factors hidden in the common features, mainly because of the single-view nature of DAL. In this work, we propose to remove non-causal factors from the common features by multi-view adversarial training on source domains, based on the observation that non-causal factors which appear insignificant in one latent space may still be significant in other latent spaces (views) due to the multi-mode structure of the data. To this end, we propose a Multi-view Adversarial Discriminator (MAD) based domain generalization model, consisting of a Spurious Correlations Generator (SCG) that increases the diversity of the source domain by random augmentation, and a Multi-View Domain Classifier (MVDC) that maps features to multiple latent spaces so that the non-causal factors are removed and the domain-invariant features are purified. Extensive experiments on six benchmarks show that MAD achieves state-of-the-art performance.
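The abstract describes the MVDC as mapping detector features into several latent spaces ("views") and discriminating the source domain in each, so that non-causal factors invisible in one view are exposed in another. The following is a minimal PyTorch sketch of that multi-view domain-adversarial idea, not the authors' released implementation: the class names (GradReverse, MultiViewDomainClassifier), the 1x1-convolution view projections, the number of views, and the DANN-style gradient-reversal training are illustrative assumptions.

```python
# Illustrative sketch only; NOT the official MAD code. Class names, the
# 1x1-conv "view" projections, and the gradient-reversal scheme are assumptions.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Gradient reversal layer commonly used in domain-adversarial learning."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients flowing back into the feature extractor.
        return -ctx.lambd * grad_output, None


class MultiViewDomainClassifier(nn.Module):
    """Projects backbone features into several latent 'views', each with its
    own domain classifier, so factors that look insignificant in one view can
    still be detected and penalized in another."""

    def __init__(self, in_channels: int, num_views: int = 3, num_domains: int = 2):
        super().__init__()
        # One 1x1-conv projection per view (each defines its own latent space).
        self.views = nn.ModuleList(
            nn.Conv2d(in_channels, in_channels, kernel_size=1) for _ in range(num_views)
        )
        # One lightweight domain classifier per view.
        self.classifiers = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
                nn.Linear(in_channels, num_domains),
            )
            for _ in range(num_views)
        )

    def forward(self, feats: torch.Tensor, lambd: float = 1.0):
        # Gradients are reversed before the view projections, so minimizing the
        # domain loss pushes the backbone toward domain-invariant features.
        reversed_feats = GradReverse.apply(feats, lambd)
        return [clf(view(reversed_feats)) for view, clf in zip(self.views, self.classifiers)]


if __name__ == "__main__":
    # Toy usage: feature maps from a detector backbone, two source domains.
    feats = torch.randn(4, 256, 32, 32)
    domain_labels = torch.tensor([0, 0, 1, 1])
    mvdc = MultiViewDomainClassifier(in_channels=256, num_views=3)
    criterion = nn.CrossEntropyLoss()
    # Sum the adversarial domain loss over all views.
    loss = sum(criterion(logits, domain_labels) for logits in mvdc(feats, lambd=0.1))
    loss.backward()
    print(float(loss))
```

In this sketch the same domain loss is summed over every view, so features that evade one projection are still penalized by the others; this mirrors the multi-view intuition stated in the abstract, though the paper's actual architecture and losses may differ.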
Pages: 8103-8112
Number of pages: 10