Vector-Decomposed Disentanglement for Domain-Invariant Object Detection

Cited by: 77
Authors
Wu, Aming [1 ,2 ]
Liu, Rui [1 ,2 ]
Han, Yahong [1 ,2 ,3 ]
Zhu, Linchao [4 ]
Yang, Yi [4 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[2] Tianjin Univ, Tianjin Key Lab Machine Learning, Tianjin, Peoples R China
[3] Peng Cheng Lab, Shenzhen, Peoples R China
[4] Univ Technol Sydney, AAII, ReLER Lab, Sydney, NSW, Australia
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Keywords
DOI
10.1109/ICCV48922.2021.00921
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To improve the generalization of detectors for domain adaptive object detection (DAOD), recent advances mainly explore aligning feature-level distributions between the source and a single target domain, which may neglect the impact of the domain-specific information remaining in the aligned features. For DAOD, it is important to extract domain-invariant object representations. To this end, in this paper we disentangle domain-invariant representations from domain-specific representations and propose a novel disentanglement method based on vector decomposition. First, an extractor is devised to separate domain-invariant representations from the input; these representations are used for extracting object proposals. Second, domain-specific representations are introduced as the difference between the input and the domain-invariant representations. Through this difference operation, the gap between the domain-specific and domain-invariant representations is enlarged, which encourages the domain-invariant representations to contain more domain-irrelevant information. In the experiments, we evaluate our method separately on the single-target and compound-target cases. For the single-target case, results on four domain-shift scenarios show that our method obtains a significant performance gain over baseline methods. Moreover, for the compound-target case (i.e., the target is a compound of two different domains without domain labels), our method outperforms baseline methods by around 4%, which demonstrates its effectiveness.
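The decomposition described in the abstract can be sketched in code. The following is a minimal, hypothetical PyTorch sketch of the vector-decomposition idea (domain-specific features defined as the input minus the extracted domain-invariant features); the class name, the small convolutional extractor, and the tensor shapes are illustrative assumptions and do not reflect the paper's actual architecture or training losses.

```python
import torch
import torch.nn as nn


class VectorDecomposedDisentangler(nn.Module):
    """Sketch: split backbone features into domain-invariant and
    domain-specific parts via a vector difference."""

    def __init__(self, channels: int):
        super().__init__()
        # Hypothetical extractor that predicts the domain-invariant component;
        # the paper's concrete extractor design is not given in the abstract.
        self.extractor = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1),
        )

    def forward(self, feats: torch.Tensor):
        # Domain-invariant component, used downstream for proposal extraction.
        invariant = self.extractor(feats)
        # Domain-specific component as the residual (vector difference).
        specific = feats - invariant
        return invariant, specific


if __name__ == "__main__":
    x = torch.randn(2, 256, 38, 50)  # example backbone feature map
    disentangler = VectorDecomposedDisentangler(256)
    inv, spec = disentangler(x)
    print(inv.shape, spec.shape)  # both torch.Size([2, 256, 38, 50])
```

In this reading, widening the gap between the two components (e.g., with an additional disentanglement objective) would push more domain-irrelevant information into the invariant branch, which is the effect the abstract attributes to the difference operation.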
Pages: 9322-9331
Number of pages: 10