Swin-Transformer-Enabled YOLOv5 with Attention Mechanism for Small Object Detection on Satellite Images

Cited by: 119

Authors:

Gong, Hang [1]
Mu, Tingkui [1]
Li, Qiuxia [1]
Dai, Haishan [2]
Li, Chunlai [3]
He, Zhiping [3]
Wang, Wenjing [1]
Han, Feng [1]
Tuniyazi, Abudusalamu [1]
Li, Haoyang [1]
Lang, Xuechan [1]
Li, Zhiyuan [1]
Wang, Bin [1]

Affiliations:

[1] Xi An Jiao Tong Univ, Res Ctr Space Opt & Astron, Sch Phys, MOE Key Lab Nonequilibrium Synth & Modulat Conden, Xian 710049, Peoples R China

[2] Shanghai Acad Spaceflight Technol, Shanghai Inst Satellite Engn, Shanghai 201109, Peoples R China

[3] Chinese Acad Sci, Shanghai Inst Tech Phys, Shanghai 200083, Peoples R China

Source:

REMOTE SENSING | 2022 / Vol. 14 / No. 12

Funding:

National Natural Science Foundation of China;

Keywords:

satellite images; object detection; self-attention mechanism; Swin transformer; deep learning; CLASSIFICATION;

DOI:

10.3390/rs14122861

Chinese Library Classification (CLC) number:

X [Environmental Science, Safety Science];

Discipline classification code:

08 ; 0830 ;

Abstract:

Object detection in natural images has made tremendous progress over the last decade. However, results are rarely satisfactory when a detection algorithm designed for natural images is applied directly to satellite images. This is due to the intrinsic differences in object scale and orientation caused by the bird's-eye perspective of satellite imagery. Moreover, the background of satellite images is complex and the object area is small; as a result, small objects tend to be missed because their features are hard to extract. Overlap and occlusion among dense objects also degrade detection performance. Although self-attention mechanisms have been introduced to detect small objects, their computational complexity increases with image resolution. To resolve these problems, we modified the general one-stage detector YOLOv5 to adapt it to satellite images. First, new feature fusion layers and a prediction head are added from the shallow layer for small object detection for the first time, because the shallow layer maximally preserves feature information. Second, the original convolutional prediction heads are replaced with Swin Transformer Prediction Heads (SPHs) for the first time. SPH is an advanced self-attention mechanism whose shifted window design reduces the computational complexity to linear. Finally, Normalization-based Attention Modules (NAMs) are integrated into YOLOv5 to improve attention performance in a normalized way. The improved YOLOv5 is termed SPH-YOLOv5. It is evaluated on the NWPU-VHR10 and DOTA datasets, which are widely used for evaluating satellite image object detection. Compared with the baseline YOLOv5, SPH-YOLOv5 improves the mean Average Precision (mAP) by 0.071 on the DOTA dataset.
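
The complexity claim in the abstract can be illustrated with a rough cost count. The sketch below is not taken from this paper; it assumes the cost estimates given in the original Swin Transformer paper, where global self-attention on an h x w feature map with C channels costs about 4hwC^2 + 2(hw)^2 C and window-based attention with M x M windows costs about 4hwC^2 + 2M^2 hwC. The function names are hypothetical, and the defaults M = 7 and C = 96 are the Swin-T settings, used here only as illustrative assumptions.

```python
def global_msa_flops(h, w, c):
    """Approximate cost of global self-attention on an h x w map with c channels."""
    tokens = h * w
    # 4*tokens*c^2 covers the Q/K/V and output projections;
    # 2*tokens^2*c covers the attention matrix and its product with V.
    return 4 * tokens * c**2 + 2 * tokens**2 * c


def window_msa_flops(h, w, c, m=7):
    """Approximate cost of window-based self-attention with m x m windows."""
    tokens = h * w
    # Each token attends only to the m*m tokens in its window,
    # so the attention term grows linearly with the token count.
    return 4 * tokens * c**2 + 2 * m**2 * tokens * c


if __name__ == "__main__":
    c = 96  # assumed channel width (Swin-T first stage)
    for side in (56, 112, 224):
        g = global_msa_flops(side, side, c)
        wv = window_msa_flops(side, side, c)
        print(f"{side}x{side}: global={g:.3e}, window={wv:.3e}, ratio={g / wv:.1f}")
```

Doubling the side length quadruples the token count: the window-based cost grows by exactly that factor, while the global cost grows faster because of its quadratic term.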


Pages: 17

