Small object detection model for UAV aerial image based on YOLOv7

Cited by: 4
Authors
Chen, Jinguang [1]
Wen, Ronghui [1]
Ma, Lili [1]
Affiliations
[1] Xian Polytech Univ, Sch Comp Sci, Shaanxi Key Lab Clothing Intelligence, Xian 710048, Peoples R China
Keywords
UAV image detection; Small object detection; YOLOv7; Swin transformer; Detection head
DOI
10.1007/s11760-023-02941-0
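Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract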
Object detection in Unmanned Aerial Vehicle (UAV) aerial images faces two main problems: small targets and target occlusion. To improve detection accuracy while maintaining efficiency, this work introduces SOD-YOLOv7, a small object detection model for UAV aerial images based on the real-time detector YOLOv7. To address small object detection, we design a module that combines Swin Transformer and convolution to better capture the global context of small objects in the image. We also introduce the Bi-Level Routing Attention (BRA) mechanism to strengthen the model's focus on small objects, and we add detection branches to improve detection at multiple scales. To handle occluded objects, we incorporate a dynamic detection head with deformable convolution and attention mechanisms, which enhances the model's spatial awareness of targets. Experiments on the VisDrone and CARPK UAV image datasets show that our model reaches an average precision (mAP@0.5) of 53.2% and 98.5%, respectively. Compared with the original YOLOv7, this is an improvement of 4.3% and 0.3%, respectively, demonstrating better performance in detecting small objects. The code will soon be released at https://github.com/Gentle-Hui/SOD-YOLOv7.
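The abstract reports mAP@0.5 without defining it. Under the common convention, a prediction counts as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below is illustrative only and is not taken from the SOD-YOLOv7 code; the function names `iou` and `is_true_positive` and the box sizes are assumptions chosen to show why the same 3-pixel localization error is harmless for a large box but fails the threshold for a small one.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """Hypothetical helper: match rule assumed for mAP@0.5 (IoU >= 0.5)."""
    return iou(pred, gt) >= threshold


# Same 3-pixel shift in x and y, applied to a large and a small box.
large_gt, large_pred = (0, 0, 100, 100), (3, 3, 103, 103)
small_gt, small_pred = (0, 0, 10, 10), (3, 3, 13, 13)

print(is_true_positive(large_pred, large_gt))  # large box still matches
print(is_true_positive(small_pred, small_gt))  # small box no longer matches
```

With these assumed boxes the large pair overlaps at an IoU of 9409/10591 and the small pair at 49/151, so only the large prediction would be counted as a hit at the 0.5 threshold.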
Pages: 2695-2707
Number of pages: 13
Related papers
42 entries in total
  • [1] Bahdanau D, 2016, arXiv, DOI [arXiv:1409.0473, 10.48550/arXiv.1409.0473]
  • [2] Carion N., 2020, EUR C COMP VIS, P213, DOI [10.48550/arXiv.2005.12872, 10.1007/978-3-030-58452-8_13]
  • [3] Underwater-YCC: Underwater Target Detection Optimization Algorithm Based on YOLOv7
    Chen, Xiao
    Yuan, Mujiahui
    Yang, Qi
    Yao, Haiyang
    Wang, Haiyan
    [J]. JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2023, 11 (05)
  • [4] Enhanced semantic feature pyramid network for small object detection
    Chen, Yuqi
    Zhu, Xiangbin
    Li, Yonggang
    Wei, Yuanwang
    Ye, Lihua
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 113
  • [5] Dynamic Head: Unifying Object Detection Heads with Attentions
    Dai, Xiyang
    Chen, Yinpeng
    Xiao, Bin
    Chen, Dongdong
    Liu, Mengchen
    Yuan, Lu
    Zhang, Lei
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 7369 - 7378
  • [6] Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
    Ding, Xiaohan
    Zhang, Xiangyu
    Han, Jungong
    Ding, Guiguang
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 11953 - 11965
  • [7] RepVGG: Making VGG-style ConvNets Great Again
    Ding, Xiaohan
    Zhang, Xiangyu
    Ma, Ningning
    Han, Jungong
    Ding, Guiguang
    Sun, Jian
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 13728 - 13737
  • [8] Dosovitskiy A., 2020, INT C LEARN REPR, DOI 10.48550/arXiv.2010.11929
  • [9] VisDrone-SOT2019: The Vision Meets Drone Single Object Tracking Challenge Results
    Du, Dawei
    Zhu, Pengfei
    Wen, Longyin
    Bian, Xiao
    Ling, Haibin
    Hu, Qinghua
    Zheng, Jiayu
    Peng, Tao
    Wang, Xinyao
    Zhang, Yue
    Bo, Liefeng
    Shi, Hailin
    Zhu, Rui
    Han, Bo
    Zhang, Chunhui
    Liu, Guizhong
    Wu, Han
    Wen, Hao
    Wang, Haoran
    Fan, Jiaqing
    Chen, Jie
    Gao, Jie
    Zhang, Jie
    Zhou, Jinghao
    Zhou, Jinliu
    Wang, Jinwang
    Wan, Jiuqing
    Kittler, Josef
    Zhang, Kaihua
    Huang, Kaiqi
    Yang, Kang
    Zhang, Kangkai
    Huang, Lianghua
    Zhou, Lijun
    Shi, Lingling
    Ding, Lu
    Wang, Ning
    Wang, Peng
    Hu, Qintao
    Laganiere, Robert
    Ma, Ruiyan
    Zhang, Ruohan
    Zou, Shanrong
    Zhao, Shengwei
    Li, Shengyang
    Zhu, Shengyin
    Li, Shikun
    Ge, Shiming
    Xuan, Shiyu
    Xu, Tianyang
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 199 - 212
  • [10] Ge Z., 2021, arXiv, V2107, DOI 10.48550/ARXIV.2107.08430