Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains

Cited by: 29
Authors
Xu, Mingjun [1]
Qin, Lingyun [1]
Chen, Weijie [2]
Pu, Shiliang [2]
Zhang, Lei [1]
Affiliations
[1] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing, Peoples R China
[2] Hikvision Res Inst, Hangzhou, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
Funding
National Key R&D Program of China
DOI
10.1109/CVPR52729.2023.00783
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Domain shift degrades the performance of object detection models in practical applications. To alleviate the influence of domain shift, much previous work tries to decouple and learn the domain-invariant (common) features from source domains via domain adversarial learning (DAL). However, inspired by causal mechanisms, we find that previous methods ignore the implicit, insignificant non-causal factors hidden in the common features, mainly because of the single-view nature of DAL. In this work, we propose to remove non-causal factors from the common features by multi-view adversarial training on source domains, based on the observation that non-causal factors which appear insignificant in one latent space may still be significant in other latent spaces (views) due to the multi-mode structure of the data. To this end, we propose a Multi-view Adversarial Discriminator (MAD) based domain generalization model, consisting of a Spurious Correlations Generator (SCG) that increases the diversity of the source domain by random augmentation, and a Multi-View Domain Classifier (MVDC) that maps features to multiple latent spaces so that the non-causal factors are removed and the domain-invariant features are purified. Extensive experiments on six benchmarks show that MAD achieves state-of-the-art performance.
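The abstract describes the MVDC as mapping detector features into several latent spaces ("views") and discriminating the source domain in each, so that non-causal factors invisible in one view are exposed in another. The following is a minimal PyTorch sketch of that multi-view domain-adversarial idea, not the authors' released implementation: the class names (GradReverse, MultiViewDomainClassifier), the 1x1-convolution view projections, the number of views, and the DANN-style gradient-reversal training are illustrative assumptions.

```python
# Illustrative sketch only; NOT the official MAD code. Class names, the
# 1x1-conv "view" projections, and the gradient-reversal scheme are assumptions.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Gradient reversal layer commonly used in domain-adversarial learning."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients flowing back into the feature extractor.
        return -ctx.lambd * grad_output, None


class MultiViewDomainClassifier(nn.Module):
    """Projects backbone features into several latent 'views', each with its
    own domain classifier, so factors that look insignificant in one view can
    still be detected and penalized in another."""

    def __init__(self, in_channels: int, num_views: int = 3, num_domains: int = 2):
        super().__init__()
        # One 1x1-conv projection per view (each defines its own latent space).
        self.views = nn.ModuleList(
            nn.Conv2d(in_channels, in_channels, kernel_size=1) for _ in range(num_views)
        )
        # One lightweight domain classifier per view.
        self.classifiers = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
                nn.Linear(in_channels, num_domains),
            )
            for _ in range(num_views)
        )

    def forward(self, feats: torch.Tensor, lambd: float = 1.0):
        # Gradients are reversed before the view projections, so minimizing the
        # domain loss pushes the backbone toward domain-invariant features.
        reversed_feats = GradReverse.apply(feats, lambd)
        return [clf(view(reversed_feats)) for view, clf in zip(self.views, self.classifiers)]


if __name__ == "__main__":
    # Toy usage: feature maps from a detector backbone, two source domains.
    feats = torch.randn(4, 256, 32, 32)
    domain_labels = torch.tensor([0, 0, 1, 1])
    mvdc = MultiViewDomainClassifier(in_channels=256, num_views=3)
    criterion = nn.CrossEntropyLoss()
    # Sum the adversarial domain loss over all views.
    loss = sum(criterion(logits, domain_labels) for logits in mvdc(feats, lambd=0.1))
    loss.backward()
    print(float(loss))
```

In this sketch the same domain loss is summed over every view, so features that evade one projection are still penalized by the others; this mirrors the multi-view intuition stated in the abstract, though the paper's actual architecture and losses may differ.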
Pages: 8103-8112
Number of pages: 10