Multi-branch attention mechanism and path enhancement for underwater object detection

Cited by: 0
Authors
Wang, Haibo [1]
Zhou, Zhiyu [1]
Affiliation
[1] Zhejiang Sci Tech Univ, Sch Comp Sci & Technol, Hangzhou 310018, Peoples R China
Source
ENGINEERING RESEARCH EXPRESS | 2025 / Vol. 7 / Issue 02
Funding
National Key Research and Development Program of China;
Keywords
underwater object detection; self-attention mechanism; small-scale object; convolutional neural network;
DOI
10.1088/2631-8695/adc5c5
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Underwater object detection is an important research area with wide-ranging applications, from underwater exploration to ecological monitoring. However, the field faces multiple challenges, particularly the significant degradation of underwater image quality and variations in target scale. Traditional object detection algorithms struggle to accurately extract key features of underwater targets, leading to poor detection performance. This study aims to improve underwater object detection, especially for small-scale targets, in complex underwater environments. We propose a novel underwater object detector, MPEDet, based on a multi-branch attention mechanism and path enhancement. Specifically, to improve the model's ability to extract key features in complex underwater environments, we propose a multi-branch attention mechanism, MBAM, which fully utilizes the dependency information between input features and input keys to strengthen semantic representation during the encoding phase. In addition, a dedicated path enhancement module facilitates information interaction between high-level and low-level features and reduces the loss of detailed information as high-level features propagate through the network. After training for only 24 epochs, MPEDet achieved AP50 values of 84.4% and 74.8% on the RUOD and UTDAC underwater test sets, respectively. The results demonstrate that the proposed MPEDet detector can effectively handle the task of underwater object detection.
Pages: 14