SRE-YOLOv8: An Improved UAV Object Detection Model Utilizing Swin Transformer and RE-FPN

Cited by: 2
Authors
Li, Jun [1 ,2 ]
Zhang, Jiajie [1 ,2 ]
Shao, Yanhua [3 ]
Liu, Feng [4 ,5 ]
Affiliations
[1] Beijing Informat Sci & Technol Univ, Artificial Intelligence Secur Innovat Res, Beijing 100192, Peoples R China
[2] Beijing Informat Sci & Technol Univ, Dept Informat Secur, Beijing 100192, Peoples R China
[3] Natl Comp Syst Engn Res Inst China, Beijing 100083, Peoples R China
[4] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200062, Peoples R China
[5] East China Normal Univ, Shanghai Int Sch Chief Technol Officer, Shanghai 200062, Peoples R China
Keywords
deep learning; object detection; YOLOv8; Swin Transformer; feature pyramid network; computational perception; FRAMEWORK;
DOI
10.3390/s24123918
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
To address the low detection accuracy on images captured by unmanned aerial vehicles (UAVs), which stems from the wide range of object sizes and types and from limited feature information, we present SRE-YOLOv8. The method improves the YOLOv8 object detection algorithm with the Swin Transformer and a lightweight residual feature pyramid network (RE-FPN). First, we introduce an optimized Swin Transformer module into the backbone network to preserve global contextual information during feature extraction and to extract a broader range of features through self-attention. Second, we integrate a Residual Feature Augmentation (RFA) module and a lightweight attention mechanism, ECA, transforming the original FPN structure into RE-FPN and strengthening the network's emphasis on critical features. Third, a small object detection (SOD) layer is added to improve the network's ability to capture spatial information, thereby raising accuracy on small objects. Finally, we employ a Dynamic Head with multiple attention mechanisms in the detection head to improve the identification of low-resolution targets against complex backgrounds. Experiments on the VisDrone2021 dataset show a 9.2% improvement over the original YOLOv8 algorithm.
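The abstract names ECA (Efficient Channel Attention) as the lightweight attention mechanism inside RE-FPN but gives no detail. The following is a minimal pure-Python sketch of channel attention as described in the original ECA-Net work, not the authors' implementation: the function names `eca_kernel_size` and `eca_attention` are hypothetical, the defaults `gamma=2` and `b=1` are assumed from ECA-Net, and the 1D convolution weights, which would be learned in a real network, are passed in as fixed illustrative values.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size: the odd integer derived from (log2(C) + b) / gamma.

    gamma and b are assumed defaults taken from ECA-Net, not from this paper.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_attention(feature_map, weights):
    """Apply ECA-style channel attention.

    feature_map: list of C channels, each an H x W list of lists of floats.
    weights: 1D convolution kernel of odd length k (illustrative, not learned).
    Returns (rescaled feature map, per-channel gates).
    """
    k = len(weights)
    pad = k // 2

    # 1) Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]

    # 2) 1D convolution across neighbouring channels (zero padding keeps length C).
    padded = [0.0] * pad + pooled + [0.0] * pad
    conv = [
        sum(weights[j] * padded[i + j] for j in range(k))
        for i in range(len(pooled))
    ]

    # 3) Sigmoid turns each response into a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-v)) for v in conv]

    # 4) Rescale every channel by its gate.
    scaled = [
        [[x * g for x in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates
```

The appeal of this design is that it adds only k weights per attention layer, because the cross-channel interaction is a single shared 1D kernel rather than a pair of fully connected layers.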
Pages: 20