Leveraging Visual Language Model and Generative Diffusion Model for Zero-Shot SAR Target Recognition

Cited by: 2
Authors
Wang, Junyu [1]
Sun, Hao [1]
Tang, Tao [1]
Sun, Yuli [1]
He, Qishan [1]
Lei, Lin [1]
Ji, Kefeng [1]
Affiliations
[1] Natl Univ Def Technol, Coll Elect Sci & Technol, Changsha 410073, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
SAR simulation; target recognition; visual language model; generative diffusion model; domain adaptation
DOI
10.3390/rs16162927
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Simulated data play an important role in SAR target recognition, particularly under zero-shot learning (ZSL) conditions caused by a lack of training samples. Traditional SAR simulation relies on manually constructed 3D target models for electromagnetic simulation, which is costly and limited by the available prior knowledge of the target. In addition, the unavoidable discrepancy between simulated and measured SAR images further limits the usefulness of traditional simulation for target recognition. To address SAR target recognition under ZSL conditions, this paper proposes a SAR simulation method based on a visual language model and a generative diffusion model: target semantic information is extracted from optical remote sensing images and transformed into a 3D model for SAR simulation. To reduce the domain shift between the simulated and measured domains, the paper also proposes a domain adaptation method based on dynamically weighted domain loss and classification loss. The effectiveness of the semantic-information-based 3D models is validated on the MSTAR dataset, and the feasibility of the proposed framework is validated on a self-built civilian vehicle dataset. The experimental results show that the proposed SAR simulation method, which the authors describe as the first based on a visual language model and a generative diffusion model, can effectively improve target recognition performance under ZSL conditions.
Pages: 22