Small object detection model for UAV aerial image based on YOLOv7

Cited by: 0
Authors
Jinguang Chen
Ronghui Wen
Lili Ma
Affiliation
[1] Xi’an Polytechnic University, The Shaanxi Key Laboratory of Clothing Intelligence, School of Computer Science
Source
Signal, Image and Video Processing | 2024 / Vol. 18
Keywords
UAV image detection; Small object detection; YOLOv7; Swin transformer; Detection head
DOI
Not available
Abstract
Object detection in unmanned aerial vehicle (UAV) aerial images faces two main problems: small targets and target occlusion. To improve detection accuracy while maintaining efficiency, this work introduces SOD-YOLOv7, a small object detection model for UAV aerial images based on the real-time detector YOLOv7. To address small object detection, we design a module that combines Swin Transformer and convolution to better capture the global context of small objects in the image, and we introduce the Bi-Level Routing Attention (BRA) mechanism to strengthen the model's focus on small objects. To improve detection at multiple scales, we add detection branches. To handle occluded objects, we incorporate a dynamic detection head with deformable convolution and attention mechanisms that enhances the model's spatial awareness of targets. Experiments on the VisDrone and CARPK UAV image datasets show that our model reaches a mean average precision (mAP@0.5) of 53.2% and 98.5%, respectively. Compared with the original YOLOv7, this is an improvement of 4.3% and 0.3%, respectively, demonstrating better performance in detecting small objects. The code will be released soon at https://github.com/Gentle-Hui/SOD-YOLOv7.
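For context on the mAP@0.5 figures quoted in the abstract: under this metric a predicted box is normally counted as a correct detection only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below is a generic illustration of that matching rule, not code from the SOD-YOLOv7 repository; the function names `iou` and `is_true_positive` and the example boxes are hypothetical. It also suggests why small objects are hard: the same 3-pixel localization error that is harmless for a large box pushes a small box below the threshold.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Matching rule assumed for mAP@0.5: IoU must reach the threshold."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical boxes: the same 3-pixel shift applied to a small and a large object.
small_gt, small_pred = (0, 0, 10, 10), (3, 3, 13, 13)
large_gt, large_pred = (0, 0, 100, 100), (3, 3, 103, 103)

print(is_true_positive(small_pred, small_gt))  # False: IoU = 49/151, about 0.32
print(is_true_positive(large_pred, large_gt))  # True: IoU = 9409/10591, about 0.89
```

A full mAP computation additionally ranks detections by confidence and averages precision over recall levels and classes; only the per-box matching step is shown here.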
Pages: 2695-2707
Page count: 12