Swin-Transformer-Enabled YOLOv5 with Attention Mechanism for Small Object Detection on Satellite Images

Cited by: 119
Authors
Gong, Hang [1]
Mu, Tingkui [1]
Li, Qiuxia [1]
Dai, Haishan [2]
Li, Chunlai [3]
He, Zhiping [3]
Wang, Wenjing [1]
Han, Feng [1]
Tuniyazi, Abudusalamu [1]
Li, Haoyang [1]
Lang, Xuechan [1]
Li, Zhiyuan [1]
Wang, Bin [1]
Affiliations
[1] Xi An Jiao Tong Univ, Res Ctr Space Opt & Astron, Sch Phys, MOE Key Lab Nonequilibrium Synth & Modulat Conden, Xian 710049, Peoples R China
[2] Shanghai Acad Spaceflight Technol, Shanghai Inst Satellite Engn, Shanghai 201109, Peoples R China
[3] Chinese Acad Sci, Shanghai Inst Tech Phys, Shanghai 200083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
satellite images; object detection; self-attention mechanism; Swin transformer; deep learning; CLASSIFICATION;
DOI
10.3390/rs14122861
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Object detection in natural images has made tremendous progress over the last decade. However, detectors designed for natural images perform poorly when applied directly to satellite images, because the bird's-eye perspective of satellite imagery produces intrinsic differences in object scale and orientation. Moreover, satellite image backgrounds are complex and objects occupy small areas, so small objects tend to be missed because their features are difficult to extract. Overlap and occlusion among densely packed objects also degrade detection performance. Although self-attention mechanisms have been introduced to detect small objects, their computational complexity increases with image resolution. To address these problems, we modified the general one-stage detector YOLOv5 to adapt it to satellite images. First, new feature fusion layers and a prediction head are added from the shallow layer for small object detection, for the first time, because the shallow layer maximally preserves feature information. Second, the original convolutional prediction heads are replaced with Swin Transformer Prediction Heads (SPHs), also for the first time. SPH is an advanced self-attention mechanism whose shifted-window design reduces the computational complexity to linear with respect to image size. Finally, Normalization-based Attention Modules (NAMs) are integrated into YOLOv5 to improve attention performance in a normalized way. The improved model, termed SPH-YOLOv5, is evaluated on the NWPU-VHR10 and DOTA datasets, which are widely used for evaluating object detection on satellite images. Compared with the baseline YOLOv5, SPH-YOLOv5 improves the mean Average Precision (mAP) by 0.071 on the DOTA dataset.
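The complexity claim in the abstract can be illustrated with a small calculation. The sketch below is not the authors' SPH implementation; it only evaluates the approximate cost expressions given in the original Swin Transformer paper for global self-attention and for window-based self-attention on an h x w feature map with C channels. The function names, the channel count of 96, and the window size of 7 are illustrative assumptions and do not come from this record.

```python
def global_msa_flops(h, w, c):
    """Approximate cost of global multi-head self-attention on an h x w map.

    Uses the expression 4*h*w*C^2 + 2*(h*w)^2*C from the Swin Transformer
    paper; the second term is quadratic in the number of tokens h*w.
    """
    n = h * w
    return 4 * n * c**2 + 2 * n**2 * c


def window_msa_flops(h, w, c, m=7):
    """Approximate cost when attention is limited to m x m local windows.

    Uses the expression 4*h*w*C^2 + 2*M^2*h*w*C; with the window size m
    fixed, both terms are linear in the number of tokens h*w.
    """
    n = h * w
    return 4 * n * c**2 + 2 * m**2 * n * c


if __name__ == "__main__":
    c = 96  # assumed channel count, for illustration only
    for side in (56, 112, 224):
        g = global_msa_flops(side, side, c)
        win = window_msa_flops(side, side, c)
        print(f"{side}x{side}: global={g:.3e}  windowed={win:.3e}")
```

Doubling the side length of the feature map quadruples the token count, so the windowed cost grows by exactly a factor of four, while the global cost grows by more than that because of its quadratic term.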
Pages: 17