Object Detection of Remote Sensing Image Based on Multi-Scale Feature Fusion and Attention Mechanism

Cited by: 6
Authors
Du, Zuoqiang [1]
Liang, Yuan [2]
Affiliations
[1] Harbin Univ Commerce, Sch Comp & Informat Engn, Harbin 150028, Peoples R China
[2] Jinan Inspur Data Technol Co Ltd, Jinan 250000, Peoples R China
Keywords
Deep learning; object detection; remote sensing image; multi-scale feature fusion pyramid network; adaptive channel spatial attention mechanism; joint teacher knowledge distillation; classification
DOI
10.1109/ACCESS.2024.3352601
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In view of the small size and dense distribution of targets in remote sensing images, this paper adds a detection head, P2, dedicated to small-scale targets on top of the three detection layers of the original YOLOv5 model, and feeds the shallow high-resolution feature map into the subsequent multi-scale feature fusion module. This effectively avoids losing the key feature information of small-scale targets during repeated downsampling. First, an enhanced multi-scale feature fusion pyramid network, DSI-FPN, is designed. The FPN+PAN network is optimized using depthwise separable convolution and Involution operators, which have fewer parameters and lower computational cost, together with a spatial attention mechanism, to generate more informative feature maps for the detection task. Second, we propose an adaptive channel spatial attention mechanism, SCBAM, which introduces a self-attention mechanism into the CBAM module to add non-local information to interactions that originally carried only local information, overcoming the limit imposed by the convolution kernel, expanding the model's receptive field, and improving its feature representation ability. Third, to address insufficient computing power when deploying the object detector on devices, we propose a feature-layer-based knowledge distillation framework with joint teachers. A teacher distillation loss is designed, and the direction of the student network's online learning is adjusted dynamically by balancing the contributions of the teacher network and the ground truth. The detection accuracy of the student network is clearly improved, and the network's parameter count and model size are effectively reduced. Finally, compared with other remote sensing image object detection methods, the experimental results show that the proposed approach detects small-scale targets in remote sensing images better under different lighting conditions. The detection accuracy reached 43.9%, 7.4% higher than that of the original model. After knowledge distillation, the model parameters are reduced to 1/3 of the original, and the detection accuracy is 40.2%.
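Two ideas in the abstract can be illustrated with simple arithmetic: why depthwise separable convolution has fewer parameters than a standard convolution, and why a P2 head gives a finer prediction grid for small targets. The sketch below is illustrative only and is not the authors' implementation. The channel counts, kernel size, and 640-pixel input are hypothetical values chosen for the example, and the strides assume the usual convention that level Pn downsamples by 2^n (so YOLOv5's three default heads P3, P4, P5 have strides 8, 16, 32, and P2 has stride 4).

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def grid_size(image_size, stride):
    """Side length of the prediction grid for a head with the given stride."""
    return image_size // stride


# Hypothetical layer: 128 -> 256 channels with a 3 x 3 kernel.
std = standard_conv_params(128, 256, 3)
dws = depthwise_separable_params(128, 256, 3)
print("standard:", std, "depthwise separable:", dws)

# Hypothetical 640 x 640 input; strides assume level Pn has stride 2**n.
for level in (2, 3, 4, 5):
    print(f"P{level}: {grid_size(640, 2 ** level)} x {grid_size(640, 2 ** level)} grid")
```

Under these assumptions the P2 head predicts on a 160 x 160 grid, four times as many cells as the 80 x 80 grid of P3, which is consistent with the abstract's motivation for involving the shallow high-resolution feature map.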
Pages: 8619-8632
Page count: 14