Unsupervised Cross-domain Object Detection Based on Progressive Multi-source Transfer

Cited: 0
Authors
Li W. [1,2]
Wang M. [1,2]
Affiliations
[1] School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming
[2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming
Source
Zidonghua Xuebao/Acta Automatica Sinica | 2022, Vol. 48, No. 9
Funding
National Natural Science Foundation of China
Keywords
domain adaptation; multi-source domain; object detection; self-training; transfer learning
DOI
10.16383/j.aas.c190532
Abstract
To address the difficulty of collecting manually labeled training samples for object detection tasks, this paper proposes an unsupervised cross-domain object detection method that progressively adapts the model at the pixel level and the feature level. Existing pixel-level domain adaptation methods generate translated images with a single style and inconsistent content structure. To solve this problem, this paper embeds the input images into a domain-invariant content space and a domain-specific attribute space, then combines the representations from the two spaces to synthesize diverse translated images that preserve spatial semantic information, enabling label transfer. In addition, for feature-level domain adaptation, to alleviate the source-bias problem caused by a single source domain, we treat the generated diverse labeled images as multiple source domains and design a multi-domain discriminator to learn multi-domain-invariant representations. Finally, to further enhance detection performance on the target domain, we propose a self-training framework that iteratively generates pseudo labels on the target training data. Experimental results on the Cityscapes to Foggy Cityscapes and VOC07 to Clipart1k adaptation tasks demonstrate that, compared with current unsupervised cross-domain detection methods, the proposed detection framework achieves better transferability. © 2022 Science Press. All rights reserved.
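The record gives no implementation details beyond the abstract. As a hedged illustration only, a DANN-style gradient-reversal layer feeding a K-way domain classifier is one common way to realize the multi-domain discriminator the abstract mentions; the PyTorch sketch below is a minimal version of that idea, and every name in it (GradReverse, MultiDomainDiscriminator, the 2048-dim pooled features, the lambda value) is an assumption, not the paper's implementation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips and scales gradients in the
    backward pass, so the backbone learns to fool the discriminator."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class MultiDomainDiscriminator(nn.Module):
    """K-way classifier over backbone features: one class per translated
    source style plus the target domain (hypothetical design)."""
    def __init__(self, in_dim: int, num_domains: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_domains),
        )

    def forward(self, feat: torch.Tensor, lambd: float = 1.0) -> torch.Tensor:
        return self.net(GradReverse.apply(feat, lambd))

# Usage sketch: pooled detector features -> domain logits.
disc = MultiDomainDiscriminator(in_dim=2048, num_domains=4)
feats = torch.randn(8, 2048)             # assumed pooled backbone features
domain_ids = torch.randint(0, 4, (8,))   # which (translated) domain each came from
loss = nn.CrossEntropyLoss()(disc(feats, lambd=0.5), domain_ids)
loss.backward()  # reversed gradients push features toward domain invariance
```

The self-training step can likewise be sketched as a confidence-thresholded pseudo-labeling pass; the detector(img) -> (boxes, scores, labels) interface and the 0.8 threshold below are illustrative assumptions, not the paper's settings.

```python
import torch

@torch.no_grad()
def make_pseudo_labels(detector, target_images, score_thresh: float = 0.8):
    """Keep only high-confidence detections on unlabeled target images
    as pseudo ground truth (hypothetical helper)."""
    detector.eval()
    pseudo = []
    for img in target_images:
        boxes, scores, labels = detector(img)   # assumed detector API
        keep = scores >= score_thresh
        pseudo.append({"boxes": boxes[keep], "labels": labels[keep]})
    return pseudo
```

In such a loop the detector would then be fine-tuned on the pseudo labels and the pass repeated, matching the iterative scheme the abstract describes.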
Pages: 2337-2351
Page count: 14