Small Target-YOLOv5: Enhancing the Algorithm for Small Object Detection in Drone Aerial Imagery Based on YOLOv5

Cited by: 16
Authors
Zhou, Jiachen [1, 2]
Su, Taoyong [1 ]
Li, Kewei [2 ]
Dai, Jiyang [2 ]
Affiliations
[1] Nanchang Hangkong Univ, Sch Gen Aviat, Nanchang 330063, Peoples R China
[2] Nanchang Hangkong Univ, Sch Informat Engn, Nanchang 330063, Jiangxi, Peoples R China
Keywords
drone aerial imagery; feature fusion network; receptive field feature extraction module; dynamic object detection head; small objects
DOI
10.3390/s24010134
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Object detection in drone aerial imagery is a long-standing research focus. Compared with standard images, aerial images have more intricate backgrounds, greater variation in object scale, and a higher proportion of small objects, so conventional object detection algorithms are often unsuitable for direct use in drone scenarios. To address these challenges, this study proposes SMT-YOLOv5 (Small Target-YOLOv5), a drone object detection model based on YOLOv5. The feature fusion network is improved by adding detection layers and adopting a weighted bidirectional feature pyramid network. In addition, the Combine Attention and Receptive Fields Block (CARFB), a receptive field feature extraction module, and the DyHead dynamic detection head are introduced to broaden the receptive field, mitigate information loss, and enhance perceptual capabilities in the spatial, scale, and task domains. Experiments on the VisDrone2021 dataset confirm a significant improvement in the detection accuracy of SMT-YOLOv5. Each improvement strategy is effective, and SMT-YOLOv5 raises average precision by 12.4 percentage points over the original method. Detection results for large, medium, and small targets improve by 6.9%, 9.5%, and 7.7%, respectively, compared to the original method. Applying the same improvement strategies to the low-complexity YOLOv8n yields SMT-YOLOv8n, which is comparable in complexity to SMT-YOLOv5s; relative to SMT-YOLOv8n, SMT-YOLOv5s achieves an average precision that is 2.5 percentage points higher. Comparative experiments with other enhancement methods further demonstrate the effectiveness of the improvement strategies.
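The abstract names a weighted bidirectional feature pyramid network but does not say how the weights are applied. As a minimal sketch, assuming the "fast normalized fusion" rule commonly associated with weighted BiFPN designs (each input feature is scaled by a non-negative learnable weight, and the result is divided by the sum of the weights plus a small epsilon), the fusion step could look like the following. The function name, the flat-list representation of feature maps, and the default `eps` are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse equally sized feature vectors with normalized non-negative weights.

    Hypothetical helper for illustration only: real feature maps are
    multi-dimensional tensors, and the weights are learned during training.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    if len(features) != len(weights):
        raise ValueError("one weight is required per feature vector")

    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # Clamp weights at zero (ReLU) so every input contributes non-negatively.
    clamped = [max(0.0, w) for w in weights]

    # eps keeps the division stable when all weights are close to zero.
    norm = eps + sum(clamped)

    # Weighted sum per position, divided by the normalizer.
    return [
        sum(w * f[k] for w, f in zip(clamped, features)) / norm
        for k in range(length)
    ]


if __name__ == "__main__":
    shallow = [1.0, 2.0]   # stand-in for a high-resolution feature
    deep = [3.0, 4.0]      # stand-in for an upsampled low-resolution feature
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

With equal weights the fused output is approximately the element-wise mean of the inputs; a weight that drops to zero or below removes its input from the fusion entirely.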
Pages: 20
Related papers
32 in total
[1]  
Breiman L, 1996, MACH LEARN, V24, P123, DOI 10.1007/BF00058655
[2]   VisDrone-DET2021: The Vision Meets Drone Object detection Challenge Results [J].
Cao, Yaru ;
He, Zhijian ;
Wang, Lujia ;
Wang, Wenguan ;
Yuan, Yixuan ;
Zhang, Dingwen ;
Zhang, Jinglin ;
Zhu, Pengfei ;
Van Gool, Luc ;
Han, Junwei ;
Hoi, Steven ;
Hu, Qinghua ;
Liu, Ming ;
Cheng, Chong ;
Liu, Fanfan ;
Cao, Guojin ;
Li, Guozhen ;
Wang, Hongkai ;
He, Jianye ;
Wan, Junfeng ;
Wan, Qi ;
Zhao, Qi ;
Lyu, Shuchang ;
Zhao, Wenzhe ;
Lu, Xiaoqiang ;
Zhu, Xingkui ;
Liu, Yingjie ;
Lv, Yixuan ;
Ma, Yujing ;
Yang, Yuting ;
Wang, Zhe ;
Xu, Zhenyu ;
Luo, Zhipeng ;
Zhang, Zhimin ;
Zhang, Zhiguang ;
Li, Zihao ;
Zhang, Zixiao .
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, :2847-2854
[3]   Dynamic Head: Unifying Object Detection Heads with Attentions [J].
Dai, Xiyang ;
Chen, Yinpeng ;
Xiao, Bin ;
Chen, Dongdong ;
Liu, Mengchen ;
Yuan, Lu ;
Zhang, Lei .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :7369-7378
[4]   Fast R-CNN [J].
Girshick, Ross .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1440-1448
[5]   UN-YOLOv5s: A UAV-Based Aerial Photography Detection Algorithm [J].
Guo, Junmei ;
Liu, Xingchen ;
Bi, Lingyun ;
Liu, Haiying ;
Lou, Haitong .
SENSORS, 2023, 23 (13)
[6]  
Jocher Glenn, 2020, Zenodo
[7]   Microsoft COCO: Common Objects in Context [J].
Lin, Tsung-Yi ;
Maire, Michael ;
Belongie, Serge ;
Hays, James ;
Perona, Pietro ;
Ramanan, Deva ;
Dollar, Piotr ;
Zitnick, C. Lawrence .
COMPUTER VISION - ECCV 2014, PT V, 2014, 8693 :740-755
[8]   Feature Pyramid Networks for Object Detection [J].
Lin, Tsung-Yi ;
Dollar, Piotr ;
Girshick, Ross ;
He, Kaiming ;
Hariharan, Bharath ;
Belongie, Serge .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :936-944
[9]  
Lindeberg Tony, 2012, SCALE INVARIANT FEAT, DOI 10.4249/scholarpedia.10491
[10]   UAV-YOLO: Small Object Detection on Unmanned Aerial Vehicle Perspective [J].
Liu, Mingjie ;
Wang, Xianhao ;
Zhou, Anjian ;
Fu, Xiuyuan ;
Ma, Yiwei ;
Piao, Changhao .
SENSORS, 2020, 20 (08)