Efficient Underwater Object Detection With Enhanced Feature Extraction and Fusion

Cited by: 0
Authors
Li, Shaoming [1 ]
Wang, Ziyi [2 ]
Dai, Rong [2 ]
Wang, Yaqing [2 ]
Zhong, Fangxun [1 ]
Liu, Yunhui [2 ]
Affiliations
[1] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen 518172, Peoples R China
[2] Chinese Univ Hong Kong, Dept Mech & Automat Engn, Hong Kong 999077, Peoples R China
Keywords
Feature extraction; Accuracy; Computational modeling; Attention mechanisms; Optical imaging; Detectors; Robots; Optical attenuators; Image quality; Adaptive systems; deep-learning optimization; feature fusion; underwater object detection (UOD); visual detection; network
DOI: 10.1109/TII.2025.3547007
CLC classification: TP [Automation technology, computer technology]
Discipline code: 0812
Abstract
Underwater object detection is critical for applications such as environmental monitoring, resource exploration, and the navigation of autonomous underwater vehicles. However, accurately detecting small objects in underwater environments remains challenging due to noisy imaging conditions, variable illumination, and complex backgrounds. To address these challenges, we propose the adaptive residual attention network (ARAN), an optimized deep learning framework designed to enhance the detection and precise identification of diminutive targets in complex aquatic settings. ARAN incorporates a proposed fusion path aggregation network (Fusion PANet), which refines spatial features by effectively distinguishing objects from their backgrounds. The framework integrates three novel modules: first, multiscale feature attention, which enhances low-level feature extraction; second, high-low feature residual learning, which rearranges channel and batch dimensions to capture pixel-level relationships through cross-dimensional interactions; and third, multilevel feature dynamic aggregation, which dynamically adjusts fusion weights to enable progressive multilevel feature fusion and mitigate conflicts in multiscale integration, ensuring that small objects are not overshadowed. Extensive experiments on four benchmark datasets demonstrate that ARAN significantly outperforms mainstream models, achieving state-of-the-art performance. Notably, on the CSIRO dataset, ARAN attains a mean average precision at an IoU threshold of 50% (mAP@50) of 98%, precision of 94.7%, F2-score of 94.6%, and recall of 94.7%. These results confirm the model's superior accuracy, robustness, and efficiency in underwater object detection, highlighting its potential for practical deployment in challenging aquatic environments. We will release the code on GitHub upon acceptance of the article.
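The paper's code is not yet released, so the following is only a minimal NumPy sketch of the general idea behind the multilevel feature dynamic aggregation module: same-resolution feature maps from different pyramid levels are fused with softmax-normalized per-level weights, so that no single level (e.g. the fine-grained one carrying small-object detail) is drowned out by the others. All function and variable names here are hypothetical, not the authors' implementation.

```python
import numpy as np

def softmax(w):
    """Numerically stable softmax over a 1-D weight vector."""
    e = np.exp(w - np.max(w))
    return e / e.sum()

def dynamic_feature_fusion(features, raw_weights):
    """Fuse a list of same-shaped feature maps with learnable scalar
    weights, normalized so the fusion is a convex combination."""
    w = softmax(np.asarray(raw_weights, dtype=np.float64))
    return sum(wi * f for wi, f in zip(w, features))

# Toy example: three 4x4 "feature maps" standing in for pyramid levels.
feats = [np.full((4, 4), v) for v in (1.0, 2.0, 3.0)]
# Equal raw weights -> equal softmax weights -> plain average (2.0 everywhere).
fused = dynamic_feature_fusion(feats, [0.0, 0.0, 0.0])
```

In a trained network the raw weights would be learned parameters updated by backpropagation; here they are fixed constants purely to illustrate the normalization and weighted-sum step.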
Pages: 4904-4914 (11 pages)