SSDA-YOLO: Semi-supervised domain adaptive YOLO for cross-domain object detection

Cited by: 60
Authors
Zhou, Huayi [1]
Jiang, Fei [2]
Lu, Hongtao [1,3]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] East China Normal Univ, Shanghai Inst AI Educ, Shanghai 200062, Peoples R China
[3] Shanghai Jiao Tong Univ, AI Inst, MOE Key Lab Artificial Intelligence, Shanghai, Peoples R China
Keywords
Domain adaptation; Knowledge distillation; Semi-supervised; YOLO
DOI
10.1016/j.cviu.2023.103649
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Domain adaptive object detection (DAOD) aims to alleviate the transfer performance degradation caused by cross-domain discrepancy. However, most existing DAOD methods are dominated by the outdated and computationally intensive two-stage Faster R-CNN, which is not the first choice for industrial applications. In this paper, we propose a novel semi-supervised domain adaptive YOLO (SSDA-YOLO) based method to improve cross-domain detection performance by integrating the compact and stronger one-stage detector YOLOv5 with domain adaptation. Specifically, we adapt the knowledge distillation framework with the Mean Teacher model to assist the student model in obtaining instance-level features of the unlabeled target domain. We also use scene style transfer to cross-generate pseudo images in different domains to remedy image-level differences. In addition, an intuitive consistency loss is proposed to further align cross-domain predictions. We evaluate SSDA-YOLO on public benchmarks including PascalVOC, Clipart1k, Cityscapes, and Foggy Cityscapes. Moreover, to verify its generalization, we conduct experiments on yawning detection datasets collected from various real classrooms. The results show considerable improvements of our method in these DAOD tasks, which reveal both the effectiveness of the proposed adaptive modules and the urgency of applying more advanced detectors in DAOD. Our code is available at https://github.com/hnuzhy/SSDA-YOLO.
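The Mean Teacher scheme mentioned in the abstract is commonly realized by keeping the teacher's weights as an exponential moving average (EMA) of the student's weights. The following is a minimal sketch of that generic update only, not the authors' implementation: the function name `ema_update`, the dictionary representation of weights, and the decay value are all assumptions chosen for illustration.

```python
def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights as an EMA of the student weights.

    `teacher` and `student` are hypothetical dicts mapping parameter
    names to floats; a real detector would hold tensors instead.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy example: after each student training step, the teacher drifts
# slowly toward the student instead of being trained by gradients.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}
teacher = ema_update(teacher, student, decay=0.9)
```

A decay close to 1 makes the teacher change slowly, which is the usual reason its predictions on unlabeled target images are treated as more stable pseudo-labels for the student.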
Pages: 9