Vector-Decomposed Disentanglement for Domain-Invariant Object Detection

Cited by: 77
Authors
Wu, Aming [1 ,2 ]
Liu, Rui [1 ,2 ]
Han, Yahong [1 ,2 ,3 ]
Zhu, Linchao [4 ]
Yang, Yi [4 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[2] Tianjin Univ, Tianjin Key Lab Machine Learning, Tianjin, Peoples R China
[3] Peng Cheng Lab, Shenzhen, Peoples R China
[4] Univ Technol Sydney, AAII, ReLER Lab, Sydney, NSW, Australia
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Keywords
DOI
10.1109/ICCV48922.2021.00921
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To improve the generalization of detectors for domain adaptive object detection (DAOD), recent advances mainly explore aligning feature-level distributions between the source and a single target domain, which may neglect the impact of the domain-specific information remaining in the aligned features. For DAOD, it is important to extract domain-invariant object representations. To this end, in this paper we disentangle domain-invariant representations from domain-specific representations and propose a novel disentanglement method based on vector decomposition. First, an extractor is devised to separate domain-invariant representations from the input; these representations are used for extracting object proposals. Second, domain-specific representations are introduced as the difference between the input and the domain-invariant representations. Through this difference operation, the gap between the domain-specific and domain-invariant representations is enlarged, which encourages the domain-invariant representations to contain more domain-irrelevant information. In the experiments, we evaluate our method separately on the single-target and compound-target cases. For the single-target case, results on four domain-shift scenarios show that our method obtains a significant performance gain over baseline methods. Moreover, for the compound-target case (i.e., the target is a compound of two different domains without domain labels), our method outperforms baseline methods by around 4%, which demonstrates its effectiveness.
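The decomposition described in the abstract can be sketched in code. The following is a minimal, hypothetical PyTorch sketch of the vector-decomposition idea (domain-specific features defined as the input minus the extracted domain-invariant features); the class name, the small convolutional extractor, and the tensor shapes are illustrative assumptions and do not reflect the paper's actual architecture or training losses.

```python
import torch
import torch.nn as nn


class VectorDecomposedDisentangler(nn.Module):
    """Sketch: split backbone features into domain-invariant and
    domain-specific parts via a vector difference."""

    def __init__(self, channels: int):
        super().__init__()
        # Hypothetical extractor that predicts the domain-invariant component;
        # the paper's concrete extractor design is not given in the abstract.
        self.extractor = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1),
        )

    def forward(self, feats: torch.Tensor):
        # Domain-invariant component, used downstream for proposal extraction.
        invariant = self.extractor(feats)
        # Domain-specific component as the residual (vector difference).
        specific = feats - invariant
        return invariant, specific


if __name__ == "__main__":
    x = torch.randn(2, 256, 38, 50)  # example backbone feature map
    disentangler = VectorDecomposedDisentangler(256)
    inv, spec = disentangler(x)
    print(inv.shape, spec.shape)  # both torch.Size([2, 256, 38, 50])
```

In this reading, widening the gap between the two components (e.g., with an additional disentanglement objective) would push more domain-irrelevant information into the invariant branch, which is the effect the abstract attributes to the difference operation.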
Pages: 9322-9331
Number of pages: 10