Maritime Small Object Detection Algorithm in Drone Aerial Images Based on Improved YOLOv8

Cited by: 0

Authors:

Ling, Peng [1]

Zhang, Yihong [2]

Ma, Shuai [1]

Affiliations:

[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai 201620, Peoples R China

[2] Donghua Univ, Coll Informat Sci & Technol, Engn Res Ctr Digitized Text & Fash Technol, Minist Educ, Shanghai 201620, Peoples R China

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Feature extraction; Convolution; Object detection; Neck; Autonomous aerial vehicles; Drones; Computational modeling; Kernel; Accuracy; Deep learning; YOLOv8; maritime object detection; UAV images; lightweight network; dilation-wise residual;

DOI:

10.1109/ACCESS.2024.3490610

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Combining unmanned aerial vehicles (UAVs) with deep learning algorithms offers an efficient, safe and inexpensive approach to maritime search and rescue (mSAR) missions. Maritime UAV images present unique challenges for object detection because of their complexity, including dense object distribution, multi-scale objects and occlusion. To address this problem, we propose a novel lightweight model specifically designed for maritime small object detection, named AB2D-YOLO. First, the attention-based intra-scale feature interaction (AIFI) module replaces the spatial pyramid pooling fast (SPPF) module in the backbone, improving detection precision for occluded and densely distributed small targets by integrating global and contextual feature information. Second, the dilation-wise residual (DWR) module is integrated into the network. The module employs three sets of dilated convolutions with different sampling rates to obtain multi-scale receptive fields, which improves the capacity for detecting multi-scale objects. Third, we propose an improved fusion network based on the weighted bi-directional feature pyramid network (BiFPN) to reconstruct the neck; it enhances the features of small targets through weighted fusion of feature information at different scales and bidirectional cross-scale connections. Finally, we add a new detection layer in the neck to capture more object location information in images. Compared with the baseline model YOLOv8s, AB2D-YOLO achieves an 8.96% increase in mean average precision (mAP) on the SeaDronesSee dataset, while maintaining low model complexity with only 6.95 MB of parameters. Compared with state-of-the-art models, AB2D-YOLO is well suited to deployment on maritime UAVs.
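Two of the mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The first function computes how wide an area one dilated convolution kernel spans, which is why branches with different sampling rates see different scales. The second shows weighted feature fusion in the "fast normalized fusion" form used by the original BiFPN design. The dilation rates (1, 3, 5), the 3x3 kernel size, and the function names are assumptions for illustration only; the abstract does not state them, and the paper's improved fusion network may differ.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels, covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights."""
    clipped = [max(w, 0.0) for w in weights]  # ReLU keeps every weight >= 0
    total = sum(clipped) + eps                # eps avoids division by zero
    fused = [0.0] * len(features[0])
    for feature, w in zip(features, clipped):
        for i, value in enumerate(feature):
            fused[i] += (w / total) * value
    return fused


# Assumed dilation rates for the three branches (not given in the abstract).
ASSUMED_DILATIONS = (1, 3, 5)
spans = [effective_kernel_size(3, d) for d in ASSUMED_DILATIONS]
print(spans)  # [3, 7, 11]

# Two toy "feature maps" flattened to vectors, fused with equal weights.
print(fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
```

With equal weights the fused output is close to the element-wise mean of the inputs; in a trained network the weights would be learned, so each scale contributes in proportion to its usefulness.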

Citation

Pages: 176527-176538

Page count: 12
