Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains

Cited by: 31

Authors:

Xu, Mingjun [1]

Qin, Lingyun [1]

Chen, Weijie [2]

Pu, Shiliang [2]

Zhang, Lei [1]

Affiliations:

[1] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing, Peoples R China

[2] Hikvision Res Inst, Hangzhou, Peoples R China

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023

Funding:

National Key Research and Development Program of China

Keywords:

DOI:

10.1109/CVPR52729.2023.00783

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Domain shift degrades the performance of object detection models in practical applications. To alleviate the influence of domain shift, many previous works try to decouple and learn the domain-invariant (common) features from source domains via domain adversarial learning (DAL). However, inspired by causal mechanisms, we find that previous methods ignore the implicit, insignificant non-causal factors hidden in the common features. This is mainly due to the single-view nature of DAL. In this work, we present an idea to remove non-causal factors from common features by multi-view adversarial training on source domains, because we observe that such insignificant non-causal factors may still be significant in other latent spaces (views) due to the multi-mode structure of the data. To summarize, we propose a Multi-view Adversarial Discriminator (MAD) based domain generalization model, consisting of a Spurious Correlations Generator (SCG) that increases the diversity of the source domain by random augmentation and a Multi-View Domain Classifier (MVDC) that maps features to multiple latent spaces, such that the non-causal factors are removed and the domain-invariant features are purified. Extensive experiments on six benchmarks show that our MAD obtains state-of-the-art performance.


Pages: 8103-8112

Page count: 10

