Rethinking general underwater object detection: Datasets, challenges, and solutions

Cited by: 121

Authors:

Fu, Chenping ^{[1]}

Liu, Risheng ^{[1,2]}

Fan, Xin ^{[1]}

Chen, Puyang ^{[1]}

Fu, Hao ^{[1]}

Yuan, Wanqi ^{[1]}

Zhu, Ming ^{[1]}

Luo, Zhongxuan ^{[1]}

Affiliations:

[1] Dalian Univ Technol, Dalian 116024, Peoples R China

[2] Peng Cheng Lab, Shenzhen 518055, Peoples R China

Source:

NEUROCOMPUTING | 2023 / Vol. 517

Funding:

National Natural Science Foundation of China; National Key Research and Development Program;

Keywords:

Underwater object detection; Image enhancement; Benchmark;

DOI:

10.1016/j.neucom.2022.10.039

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we conduct a comprehensive study of Underwater Object Detection (UOD). UOD has evolved into an attractive research field in the computer vision community in recent years. However, existing UOD datasets collected from specific underwater scenes are limited in the number of images, categories, resolution, and environmental challenges. These limitations can lead to the settings and effectiveness of models trained on existing datasets being impaired in general underwater situations. These limitations also constrain the comprehensive exploration of UOD. To alleviate these issues, we first present a new real-world UOD dataset called RUOD that places UOD in the context of general scene understanding. The dataset contains 14,000 high-resolution images, 74,903 labeled objects, and 10 common aquatic categories. The dataset also has various marine objects and rich environmental challenges including haze-like effects, color casts, and light interference. Second, we conduct extensive and systematic experiments on RUOD to evaluate the development of general underwater scene detection from the perspective of algorithms, complex marine objects, and environmental challenges. The findings from these explorations highlight the challenges of UOD and suggest promising solutions and new directions for UOD. Finally, UOD in practice typically uses underwater image enhancement during preprocessing to improve image quality. We thus characterize object detection performance on enhanced images and find an effective auxiliary framework of image enhancement for UOD. Our dataset is available at https://github.com/dlut-dimt/RUOD. (c) 2022 Elsevier B.V. All rights reserved.

References

Pages: 243-256

Page count: 14

49 entries in total

[1] Cascade R-CNN: Delving into High Quality Object Detection [J].

Cai, Zhaowei ;

Vasconcelos, Nuno .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :6154-6162

[2] Perceptual Underwater Image Enhancement With Deep Learning and Physical Priors [J].

Chen, Long ;

Jiang, Zheheng ;

Tong, Lei ;

Liu, Zhihua ;

Zhao, Aite ;

Zhang, Qianni ;

Dong, Junyu ;

Zhou, Huiyu .

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (08) :3078-3092

[3] You Only Look One-level Feature [J].

Chen, Qiang ;

Wang, Yingming ;

Yang, Tong ;

Zhang, Xiangyu ;

Cheng, Jian ;

Sun, Jian .

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :13034-13043

[4]

Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848

[5] CenterNet: Keypoint Triplets for Object Detection [J].

Duan, Kaiwen ;

Bai, Song ;

Xie, Lingxi ;

Qi, Honggang ;

Huang, Qingming ;

Tian, Qi .

2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :6568-6577

[6]

Everingham M., 2010, International journal of computer vision, V88, P303, DOI 10.1007/s11263-009-0275-4

[7] Dual Refinement Underwater Object Detection Network [J].

Fan, Baojie ;

Chen, Wei ;

Cong, Yang ;

Tian, Jiandong .

COMPUTER VISION - ECCV 2020, PT XX, 2020, 12365 :275-291

[8] NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection [J].

Ghiasi, Golnaz ;

Lin, Tsung-Yi ;

Le, Quoc V. .

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :7029-7038

[9]

He J., 2013, P 10 C OP RES AR INF, P211

[10] Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training [J].

Zhang, Hongkai ;

Chang, Hong ;

Ma, Bingpeng ;

Wang, Naiyan ;

Chen, Xilin .

COMPUTER VISION - ECCV 2020, PT XV, 2020, 12360 :260-275
