Unsupervised Cross-domain Object Detection Based on Progressive Multi-source Transfer

Cited: 0
Authors
Li W. [1,2]
Wang M. [1,2]
Affiliations
[1] School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming
[2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming
Source
Zidonghua Xuebao/Acta Automatica Sinica | 2022, Vol. 48, No. 9
Funding
National Natural Science Foundation of China
Keywords
domain adaptation; multi-source domain; object detection; self-training; transfer learning
DOI
10.16383/j.aas.c190532
Abstract
To address the difficulty of collecting manually labeled training samples for object detection tasks, this paper proposes an unsupervised cross-domain object detection method that progressively adapts the model at the pixel level and the feature level. Existing pixel-level domain adaptation methods generate translated images with a single style and inconsistent content structure. To solve this problem, this paper embeds the input images into a domain-invariant content space and a domain-specific attribute space, then combines the representations from the two spaces to synthesize diverse translated images that preserve spatial semantic information, enabling label transfer. In addition, for feature-level domain adaptation, to alleviate the source-bias problem caused by a single source domain, we treat the generated diverse labeled images as multiple source domains and design a multi-domain discriminator to learn multi-domain-invariant representations. Finally, to further enhance detection performance on the target domain, we propose a self-training framework that iteratively generates pseudo labels on the target training data. Experimental results on the Cityscapes to Foggy Cityscapes and VOC07 to Clipart1k adaptation tasks demonstrate that, compared with current unsupervised cross-domain detection methods, the proposed detection framework achieves better transferability. © 2022 Science Press. All rights reserved.
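The record gives no implementation details beyond the abstract. As a hedged illustration only, a DANN-style gradient-reversal layer feeding a K-way domain classifier is one common way to realize the multi-domain discriminator the abstract mentions; the PyTorch sketch below is a minimal version of that idea, and every name in it (GradReverse, MultiDomainDiscriminator, the 2048-dim pooled features, the lambda value) is an assumption, not the paper's implementation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips and scales gradients in the
    backward pass, so the backbone learns to fool the discriminator."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class MultiDomainDiscriminator(nn.Module):
    """K-way classifier over backbone features: one class per translated
    source style plus the target domain (hypothetical design)."""
    def __init__(self, in_dim: int, num_domains: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_domains),
        )

    def forward(self, feat: torch.Tensor, lambd: float = 1.0) -> torch.Tensor:
        return self.net(GradReverse.apply(feat, lambd))

# Usage sketch: pooled detector features -> domain logits.
disc = MultiDomainDiscriminator(in_dim=2048, num_domains=4)
feats = torch.randn(8, 2048)             # assumed pooled backbone features
domain_ids = torch.randint(0, 4, (8,))   # which (translated) domain each came from
loss = nn.CrossEntropyLoss()(disc(feats, lambd=0.5), domain_ids)
loss.backward()  # reversed gradients push features toward domain invariance
```

The self-training step can likewise be sketched as a confidence-thresholded pseudo-labeling pass; the detector(img) -> (boxes, scores, labels) interface and the 0.8 threshold below are illustrative assumptions, not the paper's settings.

```python
import torch

@torch.no_grad()
def make_pseudo_labels(detector, target_images, score_thresh: float = 0.8):
    """Keep only high-confidence detections on unlabeled target images
    as pseudo ground truth (hypothetical helper)."""
    detector.eval()
    pseudo = []
    for img in target_images:
        boxes, scores, labels = detector(img)   # assumed detector API
        keep = scores >= score_thresh
        pseudo.append({"boxes": boxes[keep], "labels": labels[keep]})
    return pseudo
```

In such a loop the detector would then be fine-tuned on the pseudo labels and the pass repeated, matching the iterative scheme the abstract describes.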
Pages: 2337-2351
Page count: 14