Sw-YoloX: An anchor-free detector based transformer for sea surface object detection

Cited by: 16
Authors
Ding, Jiangang [1]
Li, Wei [1]
Pei, Lili [1]
Yang, Ming [1]
Ye, Chao [1]
Yuan, Bo [1]
Affiliation
[1] Chang'an University, School of Information Engineering, Xi'an 710064, Shaanxi, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Sea surface object detection; Sw-YoloX; Transformer; YoloX; Self-training classifier;
DOI
10.1016/j.eswa.2023.119560
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
To cope with blurred images of sea surface objects caused by the complex, undulating sea surface environment, we propose Sw-YoloX, which uses the global modeling ability of the Transformer to encode the key semantics of sea surface objects and thereby obtains global features that a CNN cannot capture. The convolutional block attention module (CBAM) and the atrous spatial pyramid pooling (ASPP) module are integrated into the neck of the detector, and a decoupled head serves as the prediction part. We also integrate multiple training strategies to improve detector performance, such as the simple optimal transport assignment (SimOTA) strategy and multi-model integration. Finally, we construct the XM-10000 dataset for validation, based on sea surface monitoring data from Xiamen, China. With end-to-end training, Sw-YoloX outperforms the baseline and mainstream detectors, achieving an F1-score of 78.1, a mean average precision (mAP) of 54.4, and an average recall (AR) of 72.0. This work has been deployed in the coastal defense department in Xiamen, China, and has important implications for searching for survivors and preventing smuggling.
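As a reading aid for the F1-score reported in the abstract, the following is a minimal sketch of how an F1-score is conventionally computed from detection counts. It is not taken from the paper: the function name `f1_score` and the example counts are hypothetical, and the paper's own evaluation protocol (matching thresholds, averaging over classes) is not described in this record.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp, fp, fn are counts of true positives, false positives and
    false negatives. Returns 0.0 when there are no true positives,
    since precision and recall are then zero or undefined.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration.
    score = f1_score(tp=80, fp=20, fn=20)
    print(f"F1 = {100 * score:.1f}")
```

The abstract reports the F1-score on a 0 to 100 scale, which is why the sketch multiplies by 100 when printing. The reported mAP and AR are averaged quantities and cannot be plugged into this formula to recover the reported F1-score.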
Pages: 11