NAS-YOLOX: a SAR ship detection using neural architecture search and multi-scale attention

Cited by: 108
Authors
Wang, Hao [1 ]
Han, Dezhi [1 ]
Cui, Mingming [1 ]
Chen, Chongqing [1 ]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, Shanghai, Peoples R China
Keywords
Synthetic aperture radar (SAR); ship detection; you only look once version X (YOLOX); neural architecture search-feature pyramid network (NAS-FPN); NETWORK; IMAGES; TARGETS;
DOI
10.1080/09540091.2023.2257399
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Owing to its all-weather capability and high resolution, synthetic aperture radar (SAR) imagery is widely used for ship detection in military, civilian, and other domains. However, SAR-based ship detection suffers from strong target scattering, multiple target scales, and background interference, which lower detection accuracy. To address these limitations, this paper presents a novel SAR ship detection method, NAS-YOLOX, which leverages the efficient feature fusion of the neural architecture search feature pyramid network (NAS-FPN) and the effective feature extraction of a multi-scale attention mechanism. Specifically, NAS-FPN replaces the PAFPN in the baseline YOLOX, greatly enhancing the fusion of multi-scale feature information, and a dilated convolution feature enhancement module (DFEM) is designed and integrated into the backbone network to enlarge the receptive field and strengthen target-information extraction. Furthermore, a multi-scale channel-spatial attention (MCSA) mechanism is introduced to sharpen focus on target regions, improve small-target detection, and adapt to multi-scale targets. Extensive experiments on the benchmark HRSID and SSDD datasets show that NAS-YOLOX matches or surpasses other state-of-the-art ship detection models, reaching best AP0.5 accuracies of 91.1% and 97.2%, respectively.
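The abstract states only that the DFEM uses dilated convolutions to improve the receptive field; the module's exact structure is not given here. As a hedged illustration of the underlying idea, the minimal NumPy sketch below implements a plain 2-D dilated convolution (cross-correlation) and shows how dilation enlarges the effective receptive field to k + (k - 1)(d - 1) per side without adding parameters; the function name and its interface are illustrative, not the paper's implementation.

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """Valid 2-D dilated cross-correlation with a square kernel.

    Dilation inserts (dilation - 1) gaps between kernel taps, so a k x k
    kernel covers an effective window of k + (k - 1) * (dilation - 1)
    pixels per side while keeping the same number of weights.
    """
    k = kernel.shape[0]
    eff = k + (k - 1) * (dilation - 1)   # effective receptive field per side
    h, w = x.shape
    out = np.zeros((h - eff + 1, w - eff + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sample the input at dilated (strided) offsets inside the window.
            patch = x[i:i + eff:dilation, j:j + eff:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

# A 3x3 kernel with dilation 2 spans a 5x5 window, so a 5x5 input
# yields a single output value.
x = np.arange(25, dtype=float).reshape(5, 5)
print(dilated_conv2d(x, np.ones((3, 3)), dilation=2))  # [[108.]]
```

With dilation 1 the same kernel sees only a 3x3 window; with dilation 2 it sees 5x5 pixels, which is why stacking dilated convolutions is a common way to widen the receptive field for multi-scale targets.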
Pages: 1-32 (32 pages)
Related papers (64 total)
  • [1] Cascade R-CNN: Delving into High Quality Object Detection
    Cai, Zhaowei
    Vasconcelos, Nuno
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6154 - 6162
  • [2] CLVIN: Complete language-vision interaction network for visual question answering
    Chen, Chongqing
    Han, Dezhi
    Shen, Xiang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2023, 275
  • [3] CAAN: Context-Aware attention network for visual question answering
    Chen, Chongqing
    Han, Dezhi
    Chang, Chin-Chen
    [J]. PATTERN RECOGNITION, 2022, 132
  • [4] Chen LC, 2017, Arxiv, DOI arXiv:1706.05587
  • [5] A Novel Deep Learning Network with Deformable Convolution and Attention Mechanisms for Complex Scenes Ship Detection in SAR Images
    Chen, Peng
    Zhou, Hui
    Li, Ying
    Liu, Peng
    Liu, Bingxin
    [J]. REMOTE SENSING, 2023, 15 (10)
  • [6] You Only Look One-level Feature
    Chen, Qiang
    Wang, Yingming
    Yang, Tong
    Zhang, Xiangyu
    Cheng, Jian
    Sun, Jian
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 13034 - 13043
  • [7] Learning Slimming SAR Ship Object Detector Through Network Pruning and Knowledge Distillation
    Chen, Shiqi
    Zhan, Ronghui
    Wang, Wei
    Zhang, Jun
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2021, 14 : 1267 - 1282
  • [8] Attentional Feature Fusion
    Dai, Yimian
    Gieseke, Fabian
    Oehmcke, Stefan
    Wu, Yiquan
    Barnard, Kobus
    [J]. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 3559 - 3568
  • [9] CenterNet: Keypoint Triplets for Object Detection
    Duan, Kaiwen
    Bai, Song
    Xie, Lingxi
    Qi, Honggang
    Huang, Qingming
    Tian, Qi
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6568 - 6577
  • [10] Farah F., 2022, 2022 7 INT C IM SIGN