Improved Ship Detection with YOLOv8 Enhanced with MobileViT and GSConv

Cited by: 37
Authors
Zhao, Xuemeng [1]
Song, Yinglei [1]
Affiliations
[1] Jiangsu Univ Sci & Technol, Sch Sci, Zhenjiang 212003, Peoples R China
Keywords
ship detection; object detection; YOLOv8; MobileViT; GSConv
DOI
10.3390/electronics12224666
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In ship detection and recognition tasks, the irregular shapes of ships and complex backgrounds pose significant challenges. This paper presents an extension of the YOLOv8 model to address them. A lightweight vision transformer, MobileViTSF, is proposed and combined with the YOLOv8 model. To address the loss of semantic information caused by inconsistent scales when detecting small ships, a small-target detection layer is introduced, which improves the fusion of deep and shallow features. Furthermore, the traditional convolution (Conv) blocks are replaced with GSConv blocks, and a novel GSC2f block is designed to reduce the number of model parameters and improve detection performance. Experiments on a benchmark dataset suggest that the new model achieves significantly higher ship detection accuracy with fewer parameters and a smaller model size. A comparison with several other state-of-the-art methods shows that the new model obtains higher accuracy for ship detection. Moreover, the model is suitable for edge computing devices, demonstrating practical application value.
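The parameter saving attributed to GSConv can be illustrated with a rough weight count. The sketch below is not taken from the paper. It assumes the block layout commonly described for GSConv (a dense convolution producing half of the output channels, a 5x5 depthwise convolution over that half, then concatenation and a weight-free channel shuffle). The channel sizes and the function names `conv_params` and `gsconv_params` are hypothetical, and biases and normalization layers are ignored.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a 2-D convolution with a k x k kernel, no bias."""
    return (c_in // groups) * c_out * k * k


def gsconv_params(c_in, c_out, k, dw_k=5):
    """Weight count of a GSConv-style block under the assumed layout:
    dense conv to c_out/2 channels, then a depthwise conv over those
    channels; concatenation and channel shuffle add no weights."""
    half = c_out // 2
    dense = conv_params(c_in, half, k)
    depthwise = conv_params(half, half, dw_k, groups=half)
    return dense + depthwise


# Hypothetical layer: 128 input channels, 256 output channels, 3x3 kernel.
standard = conv_params(128, 256, 3)
gs = gsconv_params(128, 256, 3)
print(standard, gs, round(gs / standard, 3))
```

Under these assumptions the GSConv-style block needs roughly half the weights of the standard convolution for the same input and output widths, which is consistent with the direction of the abstract's claim about fewer parameters. The actual savings reported in the paper depend on its full architecture.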
Pages: 16