NAS-YOLOX: a SAR ship detection using neural architecture search and multi-scale attention

Cited by: 108
Authors
Wang, Hao [1 ]
Han, Dezhi [1 ]
Cui, Mingming [1 ]
Chen, Chongqing [1 ]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, Shanghai, Peoples R China
Keywords
Synthetic aperture radar (SAR); ship detection; you only look once version X (YOLOX); neural architecture search-feature pyramid network (NAS-FPN); NETWORK; IMAGES; TARGETS;
DOI
10.1080/09540091.2023.2257399
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Owing to its all-weather capability and high resolution, synthetic aperture radar (SAR) imagery is widely used for ship detection in military, civilian, and other domains. However, SAR-based ship detection suffers from strong target scattering, multiple target scales, and background interference, which lower detection accuracy. To address these limitations, this paper presents a novel SAR ship detection method, NAS-YOLOX, which leverages the efficient feature fusion of the neural architecture search feature pyramid network (NAS-FPN) and the effective feature extraction of a multi-scale attention mechanism. Specifically, NAS-FPN replaces the PAFPN in the baseline YOLOX, greatly enhancing the fusion of multi-scale feature information, and a dilated convolution feature enhancement module (DFEM) is designed and integrated into the backbone network to enlarge the receptive field and strengthen target-information extraction. Furthermore, a multi-scale channel-spatial attention (MCSA) mechanism is introduced to sharpen focus on target regions, improve small-target detection, and adapt to multi-scale targets. Extensive experiments on the benchmark HRSID and SSDD datasets show that NAS-YOLOX matches or surpasses other state-of-the-art ship detection models, reaching best AP0.5 accuracies of 91.1% and 97.2%, respectively.
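The abstract states only that the DFEM uses dilated convolutions to improve the receptive field; the module's exact structure is not given here. As a hedged illustration of the underlying idea, the minimal NumPy sketch below implements a plain 2-D dilated convolution (cross-correlation) and shows how dilation enlarges the effective receptive field to k + (k - 1)(d - 1) per side without adding parameters; the function name and its interface are illustrative, not the paper's implementation.

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """Valid 2-D dilated cross-correlation with a square kernel.

    Dilation inserts (dilation - 1) gaps between kernel taps, so a k x k
    kernel covers an effective window of k + (k - 1) * (dilation - 1)
    pixels per side while keeping the same number of weights.
    """
    k = kernel.shape[0]
    eff = k + (k - 1) * (dilation - 1)   # effective receptive field per side
    h, w = x.shape
    out = np.zeros((h - eff + 1, w - eff + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sample the input at dilated (strided) offsets inside the window.
            patch = x[i:i + eff:dilation, j:j + eff:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

# A 3x3 kernel with dilation 2 spans a 5x5 window, so a 5x5 input
# yields a single output value.
x = np.arange(25, dtype=float).reshape(5, 5)
print(dilated_conv2d(x, np.ones((3, 3)), dilation=2))  # [[108.]]
```

With dilation 1 the same kernel sees only a 3x3 window; with dilation 2 it sees 5x5 pixels, which is why stacking dilated convolutions is a common way to widen the receptive field for multi-scale targets.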
Pages: 1-32 (32 pages)
Related papers (64 total)
  • [1] Cascade R-CNN: Delving into High Quality Object Detection
    Cai, Zhaowei
    Vasconcelos, Nuno
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6154 - 6162
  • [2] CLVIN: Complete language-vision interaction network for visual question answering
    Chen, Chongqing
    Han, Dezhi
    Shen, Xiang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2023, 275
  • [3] CAAN: Context-Aware attention network for visual question answering
    Chen, Chongqing
    Han, Dezhi
    Chang, Chin-Chen
    [J]. PATTERN RECOGNITION, 2022, 132
  • [4] Chen LC, 2017, Arxiv, DOI arXiv:1706.05587
  • [5] A Novel Deep Learning Network with Deformable Convolution and Attention Mechanisms for Complex Scenes Ship Detection in SAR Images
    Chen, Peng
    Zhou, Hui
    Li, Ying
    Liu, Peng
    Liu, Bingxin
    [J]. REMOTE SENSING, 2023, 15 (10)
  • [6] You Only Look One-level Feature
    Chen, Qiang
    Wang, Yingming
    Yang, Tong
    Zhang, Xiangyu
    Cheng, Jian
    Sun, Jian
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 13034 - 13043
  • [7] Learning Slimming SAR Ship Object Detector Through Network Pruning and Knowledge Distillation
    Chen, Shiqi
    Zhan, Ronghui
    Wang, Wei
    Zhang, Jun
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2021, 14 : 1267 - 1282
  • [8] Attentional Feature Fusion
    Dai, Yimian
    Gieseke, Fabian
    Oehmcke, Stefan
    Wu, Yiquan
    Barnard, Kobus
    [J]. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 3559 - 3568
  • [9] CenterNet: Keypoint Triplets for Object Detection
    Duan, Kaiwen
    Bai, Song
    Xie, Lingxi
    Qi, Honggang
    Huang, Qingming
    Tian, Qi
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6568 - 6577
  • [10] Farah F., 2022, 2022 7 INT C IM SIGN