CAST-YOLO: An Improved YOLO Based on a Cross-Attention Strategy Transformer for Foggy Weather Adaptive Detection

Cited by: 13
Authors
Liu, Xinyi [1 ,2 ]
Zhang, Baofeng [1 ,2 ]
Liu, Na [2 ]
Affiliations
[1] Tianjin Univ Technol, Sch Comp Sci & Engn, 391 Bin Shui Xi Dao Rd, Tianjin 300384, Peoples R China
[2] Tianjin Univ Technol, Tianjin Key Lab Control Theory & Applicat Complica, 391 Bin Shui Xi Dao Rd, Tianjin 300384, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023, Vol. 13, No. 2
Keywords
computer vision; object detection; deep learning; domain adaptation
DOI
10.3390/app13021176
CLC number
O6 [Chemistry];
Discipline code
0703;
Abstract
Both transformers and one-stage detectors have shown promising object detection results and have attracted increasing attention. However, effective domain-adaptive techniques for transformer-based and one-stage detectors have not yet been widely developed. In this paper, we investigate this issue and propose a novel improved You Only Look Once (YOLO) model based on a cross-attention strategy transformer, called CAST-YOLO. CAST-YOLO is a Teacher-Student knowledge-transfer-based detector. We design a transformer encoder layer (TE-Layer) and a convolutional block attention module (CBAM) to capture global and rich contextual information. The detector then performs cross-domain object detection through knowledge distillation. Specifically, we propose a cross-attention strategy transformer to align domain-invariant features between the source and target domains. This strategy consists of three transformers with shared weights, identified as the source branch, target branch, and cross branch. Feature alignment uses knowledge distillation to achieve better knowledge transfer from the source domain to the target domain, which also makes the model more robust to noisy input. Extensive experiments show that our method outperforms existing methods in foggy-weather adaptive detection, significantly improving detection results.
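The record gives no code for the cross-attention strategy. As an illustrative sketch only (not the paper's implementation), the three shared-weight branches can be modeled as one scaled dot-product attention function applied to three query/key-value pairings: (source, source), (target, target), and (source, target). All names and weight values below are hypothetical.

```python
import math

def matmul(A, B):
    # Naive matrix multiply for small illustrative feature matrices.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query_feats, kv_feats, Wq, Wk, Wv):
    # Scaled dot-product attention. Queries may come from a different
    # domain than keys/values, which yields the "cross" branch.
    Q = matmul(query_feats, Wq)
    K = matmul(kv_feats, Wk)
    V = matmul(kv_feats, Wv)
    d = len(Q[0])
    scores = matmul(Q, [list(col) for col in zip(*K)])  # Q @ K^T
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, V)

# Toy 2-D features: 3 source-domain tokens, 3 target-domain tokens.
src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tgt = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.5]]

# One projection weight set reused by all three branches (shared weights).
Wq = [[1.0, 0.0], [0.0, 1.0]]
Wk = [[1.0, 0.0], [0.0, 1.0]]
Wv = [[0.5, 0.0], [0.0, 0.5]]

source_branch = attention(src, src, Wq, Wk, Wv)  # self-attention on source
target_branch = attention(tgt, tgt, Wq, Wk, Wv)  # self-attention on target
cross_branch  = attention(src, tgt, Wq, Wk, Wv)  # source queries, target keys/values
```

Because the weights are shared, the cross branch mixes target-domain values under source-domain queries, which is one plausible way to encourage domain-invariant features; the paper's actual branch wiring may differ.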
Pages: 15
Related papers
29 records in total
[1]   End-to-End Object Detection with Transformers [J].
Carion, Nicolas ;
Massa, Francisco ;
Synnaeve, Gabriel ;
Usunier, Nicolas ;
Kirillov, Alexander ;
Zagoruyko, Sergey .
COMPUTER VISION - ECCV 2020, PT I, 2020, 12346 :213-229
[2]  
Chen C., 2021, P IEEE C COMPUTER VI, P12576
[3]   Domain Adaptive Faster R-CNN for Object Detection in the Wild [J].
Chen, Yuhua ;
Li, Wen ;
Sakaridis, Christos ;
Dai, Dengxin ;
Van Gool, Luc .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :3339-3348
[4]   The Cityscapes Dataset for Semantic Urban Scene Understanding [J].
Cordts, Marius ;
Omran, Mohamed ;
Ramos, Sebastian ;
Rehfeld, Timo ;
Enzweiler, Markus ;
Benenson, Rodrigo ;
Franke, Uwe ;
Roth, Stefan ;
Schiele, Bernt .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3213-3223
[5]   Unbiased Mean Teacher for Cross-domain Object Detection [J].
Deng, Jinhong ;
Li, Wen ;
Chen, Yuhua ;
Duan, Lixin .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :4089-4099
[6]   Fast R-CNN [J].
Girshick, Ross .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1440-1448
[7]   Rich feature hierarchies for accurate object detection and semantic segmentation [J].
Girshick, Ross ;
Donahue, Jeff ;
Darrell, Trevor ;
Malik, Jitendra .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :580-587
[8]  
Jocher G., 2020, YOLOv5
[9]   MULTISCALE DOMAIN ADAPTIVE YOLO FOR CROSS-DOMAIN OBJECT DETECTION [J].
Hnewa, Mazin ;
Radha, Hayder .
2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, :3323-3327
[10]   Every Pixel Matters: Center-Aware Feature Alignment for Domain Adaptive Object Detector [J].
Hsu, Cheng-Chun ;
Tsai, Yi-Hsuan ;
Lin, Yen-Yu ;
Yang, Ming-Hsuan .
COMPUTER VISION - ECCV 2020, PT IX, 2020, 12354 :733-748