An Efficient UAV Image Object Detection Algorithm Based on Global Attention and Multi-Scale Feature Fusion

Cited by: 1
Authors
Qian, Rui [1]
Ding, Yong [2]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 210016, Peoples R China
Keywords
UAV; object detection; global attention; feature fusion
DOI
10.3390/electronics13203989
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Object detection technology holds significant promise in unmanned aerial vehicle (UAV) applications. However, traditional methods struggle to detect the denser, smaller, and more complex targets found in UAV aerial images. To address issues such as target occlusion and dense small objects, this paper proposes a multi-scale object detection algorithm based on YOLOv5s. A novel feature extraction module, DCNCSPELAN4, which combines CSPNet and ELAN, is introduced to enlarge the receptive field of feature extraction while maintaining network efficiency. A lightweight Vision Transformer module, the CloFormer Block, is integrated to give the network a global receptive field. In the neck network, the algorithm incorporates a three-scale feature fusion (TFE) module and a scale sequence feature fusion (SSFF) module to leverage multi-scale spatial information across different feature maps. To better detect dense small objects, an additional small-object detection head is added to the detection layer, and the original large-object detection head is removed to reduce the computational load. The proposed algorithm is evaluated through ablation experiments and compared with other state-of-the-art methods on the VisDrone2019 and AU-AIR datasets. The results show that the algorithm outperforms the other methods in both accuracy and speed. Compared with the YOLOv5s baseline model, the enhanced algorithm improves AP50 and AP by 12.4% and 8.4%, respectively, with only a marginal parameter increase of 0.3 M. These experiments validate the effectiveness of the algorithm for object detection in drone imagery.
Pages: 20