SRE-YOLOv8: An Improved UAV Object Detection Model Utilizing Swin Transformer and RE-FPN

Cited by: 2

Authors:

Li, Jun ^{[1,2]}

Zhang, Jiajie ^{[1,2]}

Shao, Yanhua ^{[3]}

Liu, Feng ^{[4,5]}

Affiliations:

[1] Beijing Informat Sci & Technol Univ, Artificial Intelligence Secur Innovat Res, Beijing 100192, Peoples R China

[2] Beijing Informat Sci & Technol Univ, Dept Informat Secur, Beijing 100192, Peoples R China

[3] Natl Comp Syst Engn Res Inst China, Beijing 100083, Peoples R China

[4] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200062, Peoples R China

[5] East China Normal Univ, Shanghai Int Sch Chief Technol Officer, Shanghai 200062, Peoples R China

Source:

SENSORS | 2024 / Vol. 24 / Issue 12

Keywords:

deep learning; object detection; YOLOv8; Swin Transformer; feature pyramid network; computational perception; FRAMEWORK;

DOI:

10.3390/s24123918

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Discipline classification code:

070302 ; 081704 ;

Abstract:

To address the low detection accuracy on images captured by unmanned aerial vehicles (UAVs), which arises from the diverse sizes and types of objects and from limited feature information, we present SRE-YOLOv8. Our method enhances the YOLOv8 object detection algorithm with the Swin Transformer and a lightweight residual feature pyramid network (RE-FPN) structure. First, we introduce an optimized Swin Transformer module into the backbone network to preserve global contextual information during feature extraction and to extract a broader range of features through self-attention. Second, we integrate a Residual Feature Augmentation (RFA) module and a lightweight attention mechanism, ECA, transforming the original FPN structure into RE-FPN and strengthening the network's emphasis on critical features. Third, a small object detection (SOD) layer is added to improve the network's ability to capture spatial information, thereby increasing accuracy on small objects. Finally, we employ a Dynamic Head equipped with multiple attention mechanisms as the detection head to improve identification of low-resolution targets against complex backgrounds. Experimental evaluation on the VisDrone2021 dataset shows a 9.2% improvement over the original YOLOv8 algorithm.


Pages: 20
