YOLOv7-PE: A Precise and Efficient Enhancement of YOLOv7 for Underwater Target Detection

Cited by: 3

Authors:

Li, Zhichuang [1]

Xie, Haijun [1,2]

Feng, Jingyi [1]

Wang, Zhenbo [1]

Yuan, Zizhao [1]

Affiliations:

[1] Beijing Inst Technol, Zhuhai 519088, Peoples R China

[2] Guangdong Prov Lab Lingnan Modern Agr Sci & Techno, Heyuan Branch, Heyuan 517000, Peoples R China

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Feature extraction; Computational modeling; Accuracy; Head; Adaptation models; Neck; Computational efficiency; Object detection; Underwater target detection; YOLOv7-PE; efficient decoupled head; anchor-free; CSPSPPF; CBAM;

DOI:

10.1109/ACCESS.2024.3417322

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In underwater target detection, image blurring, complex backgrounds, and clusters of small targets make feature extraction difficult and lead to missed and false detections. To improve the accuracy, real-time performance, and compactness of underwater target detection models, we propose YOLOv7-PE, an accurate and efficient improved YOLOv7 model for underwater target detection. YOLOv7-PE builds on the single-stage detector YOLOv7 and uses a decoupled head design that handles the classification and regression tasks separately to enhance feature extraction. We also introduce an anchor-free design, which simplifies the detection process, reduces prediction time, and adapts to targets in underwater environments. To improve computational efficiency, we introduce the CSPSPPF module, which reduces the computational cost of the model and improves inference speed. In addition, we introduce the CBAM attention mechanism to enhance the feature representation in both the channel and spatial dimensions. Through extensive qualitative and quantitative analyses, we verify that YOLOv7-PE achieves higher detection accuracy and efficiency on target detection in complex underwater environments. Relative to YOLOv7, the mean average precision (mAP) of YOLOv7-PE improves by 1.23% and its frames per second (FPS) by 1.52%, while the number of model parameters is reduced by 6.78%. YOLOv7-PE is also more accurate and efficient than other classical target detection models.


Pages: 133937-133951

Page count: 15

References: 50 in total
