Fine-tuned YOLOv5 for real-time vehicle detection in UAV imagery: Architectural improvements and performance boost

Cited by: 59
Authors
Hamzenejadi, Mohammad Hossein [1]
Mohseni, Hadis [1]
Affiliations
[1] Shahid Bahonar Univ Kerman, Comp Engn Dept, Kerman, Iran
Keywords
YOLOv5; UAV imagery; Vehicle detection; Object detection; Real-time; ATTENTION; NETWORKS
DOI
10.1016/j.eswa.2023.120845
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Nowadays, Unmanned Aerial Vehicles (UAVs) have become useful for various civil applications, such as traffic monitoring and smart parking, where real-time vehicle detection and classification is one of the key tasks. Detecting vehicles poses many challenges, including small object sizes and variation in the UAV's altitude and angle. As classic object detection solutions have limitations in confronting these challenges, recent methods are based on convolutional neural networks and their ability to learn features effectively. Because of the computational complexity of these networks and the need for accurate, real-time object detection, balancing accuracy and inference speed is essential. This paper proposes an accurate, efficient and real-time vehicle detection network based on the successful YOLOv5 object detection model. This is done by improving the structure of the model, adding an attention mechanism and using an adaptive bounding box regression loss function. Also, considering the need for real-time inference speed, the depth and width of the model were balanced and ghost convolution was incorporated into the Neck unit to further improve the balance between accuracy and inference speed. The proposed method is evaluated on three urban UAV imagery datasets, VisDrone, CARPK and VAID, specifically intended for civil applications. Compared with the YOLOv5 baseline models, the proposed method achieved 3.52% higher mAP50 and 207.15% higher FPS than YOLOv5X on the VisDrone dataset, while being much smaller in size and GFLOPS. Overall, the results show how the applied structural and conceptual modifications can upgrade the YOLO family towards being small in size, high in accuracy and fast in inference speed.
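To give a rough sense of why ghost convolution can shrink a model, the sketch below counts convolution weights for a standard layer and for a ghost layer. It is an illustration only, not the authors' configuration: the abstract does not state layer sizes, so the 256-channel, 3x3 example is hypothetical, and the ghost layout (half the output channels from a regular convolution, the other half from a 5x5 depthwise convolution applied to the first half) is an assumption modeled on the GhostConv module in the public YOLOv5 code. The helper names `conv_weights` and `ghost_conv_weights` are made up for this example.

```python
def conv_weights(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a 2D convolution; bias and normalization are ignored."""
    return (c_in // groups) * c_out * k * k


def ghost_conv_weights(c_in: int, c_out: int, k: int, cheap_k: int = 5) -> int:
    """Weight count of a ghost convolution (assumed layout, ratio 2)."""
    half = c_out // 2
    # Primary branch: a regular convolution producing half of the output channels.
    primary = conv_weights(c_in, half, k)
    # Cheap branch: a depthwise convolution that derives the remaining half
    # of the channels from the primary branch's output.
    cheap = conv_weights(half, half, cheap_k, groups=half)
    return primary + cheap


# Hypothetical layer: 256 input channels, 256 output channels, 3x3 kernel.
standard = conv_weights(256, 256, 3)
ghost = ghost_conv_weights(256, 256, 3)
print(standard, ghost, round(ghost / standard, 3))  # 589824 298112 0.505
```

Under these assumptions the ghost layer needs about half the weights of the standard layer, which is consistent with the abstract's claim of a smaller model, although the reported size and speed figures also depend on the other modifications the abstract lists.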
Pages: 18