YOLOv7-PE: A Precise and Efficient Enhancement of YOLOv7 for Underwater Target Detection

Cited by: 3
Authors
Li, Zhichuang [1]
Xie, Haijun [1, 2]
Feng, Jingyi [1]
Wang, Zhenbo [1]
Yuan, Zizhao [1]
Affiliations
[1] Beijing Inst Technol, Zhuhai 519088, Peoples R China
[2] Guangdong Prov Lab Lingnan Modern Agr Sci & Techno, Heyuan Branch, Heyuan 517000, Peoples R China
Keywords
Feature extraction; Computational modeling; Accuracy; Head; Adaptation models; Neck; Computational efficiency; Object detection; Underwater target detection; YOLOv7-PE; efficient decoupled head; anchor-free; CSPSPPF; CBAM;
DOI
10.1109/ACCESS.2024.3417322
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Underwater target detection is hampered by image blurring, complex backgrounds, and dense clusters of small targets, which make feature extraction difficult and cause missed and false detections. To improve the accuracy, real-time performance, and compactness of underwater target detection models, we propose YOLOv7-PE, an accurate and efficient improvement of YOLOv7 for underwater target detection. YOLOv7-PE builds on the single-stage detector YOLOv7 and uses a decoupled head design that handles the classification and regression tasks separately, which enhances feature extraction. We also introduce an anchor-free design, which simplifies the detection process, reduces prediction time, and adapts to targets in underwater environments. To improve computational efficiency, we introduce the CSPSPPF module, which reduces the computational cost of the model and improves inference speed. In addition, we introduce the CBAM attention mechanism to enhance the feature representation in both the channel and spatial dimensions. Through extensive qualitative and quantitative analyses, we verify that YOLOv7-PE achieves higher detection accuracy and efficiency on target detection in complex underwater environments. Relative to YOLOv7, the mean average precision (mAP) of YOLOv7-PE improves by 1.23%, the frames per second (FPS) improve by 1.52%, and the number of model parameters is reduced by 6.78%. YOLOv7-PE is also more accurate and more efficient than other classical target detection models.
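The abstract does not describe the internals of CSPSPPF. As a rough illustration only, the sketch below assumes its pooling stage follows the widely used SPPF arrangement, in which three 5x5 max-pooling layers applied in series replace parallel 5x5, 9x9, and 13x13 pools. The function names (`max_pool_same`, `spp_parallel`, `sppf_serial`) are hypothetical and are not taken from the paper. The check shows that the two arrangements give identical outputs, while the serial form compares far fewer values per output position (3 x 25 = 75 instead of 25 + 81 + 169 = 275).

```python
import random

NEG_INF = float("-inf")


def max_pool_same(x, k):
    """Stride-1 max pooling over a 2D list; positions outside the map are ignored."""
    r = k // 2
    h, w = len(x), len(x[0])
    out = [[NEG_INF] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            out[i][j] = max(
                x[a][b]
                for a in range(max(0, i - r), min(h, i + r + 1))
                for b in range(max(0, j - r), min(w, j + r + 1))
            )
    return out


def spp_parallel(x):
    """SPP-style: three independent pools with growing windows."""
    return [max_pool_same(x, k) for k in (5, 9, 13)]


def sppf_serial(x):
    """SPPF-style (assumed here): one 5x5 pool reused three times in series."""
    p1 = max_pool_same(x, 5)
    p2 = max_pool_same(p1, 5)
    p3 = max_pool_same(p2, 5)
    return [p1, p2, p3]


random.seed(0)
feature_map = [[random.random() for _ in range(16)] for _ in range(16)]

# Chaining 5x5 pools reproduces the 5x5, 9x9, and 13x13 results exactly.
assert spp_parallel(feature_map) == sppf_serial(feature_map)
```

In a real detector these pooled maps would typically be concatenated along the channel axis and fused by a convolution; that step, and the CSP split around it, are omitted here because the abstract gives no detail on them.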
Pages: 133937-133951
Number of pages: 15