CViTF-Net: A Convolutional and Visual Transformer Fusion Network for Small Ship Target Detection in Synthetic Aperture Radar Images

Cited by: 6
Authors
Huang, Min [1,2]
Liu, Tianen [2]
Chen, Yazhou [1]
Affiliations
[1] Army Engn Univ, Shijiazhuang Campus, Shijiazhuang 050003, Peoples R China
[2] Hebei Univ Sci & Technol, Shijiazhuang 050018, Peoples R China
Keywords
synthetic aperture radar (SAR); small ship targets; transformer; convolutional and visual transformer fusion network (CViTF-Net); ship detection; SAR images
DOI
10.3390/rs15184373
Chinese Library Classification
X [Environmental science, safety science]
Subject classification code
08; 0830
Abstract
Detecting small ship targets in large-scale synthetic aperture radar (SAR) images with complex backgrounds is challenging because the targets have indistinct visual features and are subject to noise interference. To address these issues, we propose a novel two-stage detector, the convolutional and visual transformer fusion network (CViTF-Net), and enhance its detection performance through three modules. First, we design a pyramid-structured CViT backbone, which uses convolutional blocks to extract low-level and local features and transformer blocks to capture inter-object dependencies over larger image regions. The CViT backbone thereby integrates local and global information to strengthen the feature representation of targets. Second, we propose the Gaussian prior discrepancy (GPD) assigner, which uses the discrepancy between two-dimensional Gaussian distributions to assess how well priors match ground truth boxes, thus refining the criteria that separate positive from negative samples. Third, we design the level synchronized attention mechanism (LSAM), which simultaneously considers information from multiple layers in region of interest (RoI) feature maps and adaptively adjusts the weights of different regions within the final RoI, improving the capture of both target details and contextual information. CViTF-Net achieved the highest comprehensive evaluation results on the public LS-SSDD-v1.0 dataset, with an mAP of 79.7% and an F1 of 80.8%. In addition, the robustness of CViTF-Net was validated on the public SSDD dataset. Visualization of the experimental results indicated that CViTF-Net can effectively enhance detection performance for small ship targets in complex scenes.
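The idea behind a Gaussian-based assigner can be illustrated with a minimal sketch. The abstract does not state which discrepancy measure the GPD assigner uses, so the code below makes several assumptions: each box is modeled as an axis-aligned 2-D Gaussian whose standard deviations are half the box width and height, the discrepancy is the Kullback-Leibler divergence, and the hypothetical helper `match_score` maps it into (0, 1]. It is not the authors' implementation.

```python
import math


def box_to_gaussian(cx, cy, w, h):
    """Model a box as a 2-D Gaussian (assumption: std = half the side length)."""
    return (cx, cy), (w / 2.0, h / 2.0)


def kl_divergence(prior_box, gt_box):
    """KL(prior || gt) for two axis-aligned Gaussians, summed over x and y."""
    (px, py), (psx, psy) = box_to_gaussian(*prior_box)
    (gx, gy), (gsx, gsy) = box_to_gaussian(*gt_box)
    kl = 0.0
    for mu_p, mu_g, s_p, s_g in ((px, gx, psx, gsx), (py, gy, psy, gsy)):
        kl += 0.5 * (
            (s_p / s_g) ** 2            # ratio of variances
            + ((mu_g - mu_p) / s_g) ** 2  # squared centre offset, scaled
            - 1.0
            + 2.0 * math.log(s_g / s_p)   # log ratio of variances
        )
    return kl


def match_score(prior_box, gt_box):
    """Hypothetical matching score in (0, 1]; 1 means identical Gaussians."""
    return 1.0 / (1.0 + kl_divergence(prior_box, gt_box))


# Boxes are (cx, cy, w, h). These two 4x4 boxes do not overlap, so their
# IoU is 0, yet the Gaussian score is still positive and varies with distance.
near = match_score((10.0, 10.0, 4.0, 4.0), (15.0, 10.0, 4.0, 4.0))
far = match_score((10.0, 10.0, 4.0, 4.0), (30.0, 10.0, 4.0, 4.0))
```

Under these assumptions, a small prior that misses a small ground truth box by a few pixels still receives a graded score rather than a hard zero, which is the kind of behavior that motivates distribution-based criteria for small targets.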
Pages: 26