Underwater Object Detection Using TC-YOLO with Attention Mechanisms

Cited by: 43
Authors
Liu, Kun [1]
Peng, Lei [1]
Tang, Shanran [1]
Affiliations
[1] South China Univ Technol, Sch Civil Engn & Transportat, Guangzhou 510641, Peoples R China
Keywords
object detection; underwater image; YOLOv5; coordinate attention; transformer; adaptive histogram equalization
DOI
10.3390/s23052567
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Underwater object detection is a key technology in the development of intelligent underwater vehicles. Object detection faces unique challenges in underwater applications: blurry underwater images, small and dense targets, and limited computational capacity on the deployed platforms. To improve the performance of underwater object detection, we propose a new object detection approach that combines a new detection neural network called TC-YOLO, an image enhancement technique using an adaptive histogram equalization algorithm, and an optimal transport scheme for label assignment. The proposed TC-YOLO network was developed based on YOLOv5s. Transformer self-attention and coordinate attention were adopted in the backbone and neck of the new network, respectively, to enhance feature extraction for underwater objects. The application of optimal transport label assignment enables a significant reduction in the number of fuzzy boxes and improves the utilization of training data. Our tests using the RUIE2020 dataset and ablation experiments demonstrate that the proposed approach performs better than the original YOLOv5s and other similar networks for underwater object detection tasks; moreover, the size and computational cost of the proposed model remain small enough for underwater mobile applications.
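As a rough illustration of the image enhancement step named in the abstract, the sketch below applies histogram equalization independently to each tile of a small grayscale image, which is the basic idea behind adaptive histogram equalization. It is a simplified assumption-laden example, not the authors' implementation: the abstract does not specify the variant used, and practical variants such as CLAHE also add contrast clipping and interpolation between tiles, both omitted here. The function names `equalize_tile` and `adaptive_equalize`, the tile size, and the sample image are hypothetical.

```python
def equalize_tile(pixels, levels=256):
    """Histogram-equalize a flat list of integer gray levels in [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of gray levels within this tile.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Flat tile (a single gray level): nothing to stretch.
        return list(pixels)
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]


def adaptive_equalize(image, tile=2, levels=256):
    """Equalize each tile x tile block of a 2-D list independently.

    Simplified sketch: no clip limit and no inter-tile interpolation.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            coords = [(r, c)
                      for r in range(top, min(top + tile, height))
                      for c in range(left, min(left + tile, width))]
            new_vals = equalize_tile([image[r][c] for r, c in coords], levels)
            for (r, c), v in zip(coords, new_vals):
                out[r][c] = v
    return out


# Hypothetical low-contrast image: a dark region next to a bright region.
image = [
    [50, 52, 200, 201],
    [51, 53, 202, 203],
]
result = adaptive_equalize(image, tile=2)
```

Because each tile is stretched using only its own histogram, local contrast is recovered in both the dark and the bright region, which a single global equalization would not achieve to the same degree.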
Pages: 15
Related papers
59 in total
[1]   YOLO-Fish: A robust fish detection model to detect fish in realistic underwater environment [J].
Al Muksit, Abdullah ;
Hasan, Fakhrul ;
Emon, Md. Fahad Hasan Bhuiyan ;
Haque, Md Rakibul ;
Anwary, Arif Reza ;
Shatabda, Swakkhar .
ECOLOGICAL INFORMATICS, 2022, 72
[2]  
[Anonymous], 2014, International Journal of Computer Applications, DOI 10.5120/15268-3743
[3]  
Bochkovskiy A., 2020, YOLOv4: Optimal Speed and Accuracy of Object Detection, arXiv:2004.10934
[4]  
Dosovitskiy A., 2021, arXiv
[5]  
Ge Z., 2021, ARXIV, DOI 10.48550/ARXIV.2107.08430
[6]   OTA: Optimal Transport Assignment for Object Detection [J].
Ge, Zheng ;
Liu, Songtao ;
Liu, Zeming ;
Yoshie, Osamu ;
Sun, Jian .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :303-312
[7]   LLA: Loss-aware label assignment for dense pedestrian detection [J].
Ge, Zheng ;
Wang, Jianfeng ;
Huang, Xin ;
Liu, Songtao ;
Yoshie, Osamu .
NEUROCOMPUTING, 2021, 462 :272-281
[8]   Nonparametric Variational Auto-encoders for Hierarchical Representation Learning [J].
Goyal, Prasoon ;
Hu, Zhiting ;
Liang, Xiaodan ;
Wang, Chenyu ;
Xing, Eric P. .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :5104-5112
[9]   Underwater Image Processing and Object Detection Based on Deep CNN Method [J].
Han, Fenglei ;
Yao, Jingzheng ;
Zhu, Haitao ;
Wang, Chunhui .
JOURNAL OF SENSORS, 2020, 2020
[10]   Deep Supervised Residual Dense Network for Underwater Image Enhancement [J].
Han, Yanling ;
Huang, Lihua ;
Hong, Zhonghua ;
Cao, Shouqi ;
Zhang, Yun ;
Wang, Jing .
SENSORS, 2021, 21 (09)