Zero-Shot Cross-Modal Retrieval for Remote Sensing Images With Minimal Supervision

Cited by: 15
Authors
Chaudhuri, Ushasi [1]
Bose, Rupak [1]
Banerjee, Biplab [1]
Bhattacharya, Avik [1]
Datcu, Mihai [2]
Affiliations
[1] Indian Inst Technol, Ctr Studies Resources Engn CSRE, Mumbai 400076, Maharashtra, India
[2] German Aerosp Ctr DLR, D-82234 Wessling, Germany
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2022 / Vol. 60
Keywords
Training; Protocols; Task analysis; Semantics; Sensors; Image retrieval; Training data; Cross-modal retrieval (CMR); few-shot learning (FSL); multispectral; remote sensing (RS); synthetic aperture radar (SAR); zero-shot learning (ZSL); CLASSIFICATION; NETWORK; MODEL;
DOI
10.1109/TGRS.2022.3196307
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
The performance of a deep-learning-based model relies primarily on the diversity and size of the training dataset. However, obtaining such a large amount of labeled data for practical remote sensing (RS) applications is expensive and labor-intensive. Training protocols have previously been proposed for few-shot learning (FSL) and zero-shot learning (ZSL). However, FSL cannot handle unobserved class data at the inference phase, while ZSL requires many training samples of the seen classes. In this work, we propose a novel training protocol for image retrieval and name it label-deficit zero-shot learning (LDZSL). We use this LDZSL training protocol for the challenging task of cross-sensor data retrieval in RS. This protocol uses very few labeled data samples of the seen classes during training and interprets unobserved class data samples at the inference phase. This strategy is critical as some data modalities are hard to annotate without domain experts. This work proposes a novel bilevel Siamese network to perform the LDZSL cross-sensor retrieval of multispectral and synthetic aperture radar (SAR) images. We use the available georeferenced SAR and multispectral data to domain-align the embedding features of the two modalities. We experimentally demonstrate the proposed model's efficacy on the So2Sat dataset by comparing it with existing state-of-the-art ZSL models trained under a reduced training set. We also show the generalizability of the proposed model using a sketch-based image retrieval task. Experimental results on the Earth on the Canvas dataset exhibit performance comparable to that reported in the literature.
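The data split implied by the LDZSL protocol in the abstract (very few labeled samples per seen class for training, unseen classes held out entirely for inference) might be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `ldzsl_split`, its parameters, and the toy class labels are all assumptions introduced here.

```python
import random
from collections import defaultdict


def ldzsl_split(labels, unseen_classes, shots_per_class, seed=0):
    """Hypothetical helper: build a label-deficit zero-shot split.

    Training indices: at most `shots_per_class` samples from each seen class.
    Test indices: every sample whose class is in `unseen_classes`.
    """
    rng = random.Random(seed)
    unseen = set(unseen_classes)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label, idxs in sorted(by_class.items()):
        if label in unseen:
            # Unseen classes never contribute to training.
            test_idx.extend(idxs)
        else:
            # Keep only a handful of labeled samples per seen class.
            k = min(shots_per_class, len(idxs))
            train_idx.extend(rng.sample(idxs, k))
    return sorted(train_idx), sorted(test_idx)


# Toy example (assumed class names): two seen classes, one unseen class.
toy_labels = ["forest"] * 10 + ["urban"] * 10 + ["water"] * 10
train, test = ldzsl_split(toy_labels, unseen_classes=["water"], shots_per_class=2)
```

In this toy setting the training set holds two samples each of the seen classes, and every sample of the unseen class is reserved for retrieval at inference time.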
Pages: 15
Related papers
50 in total
  • [1] Ternary Adversarial Networks With Self-Supervision for Zero-Shot Cross-Modal Retrieval
    Xu, Xing
    Lu, Huimin
    Song, Jingkuan
    Yang, Yang
    Shen, Heng Tao
    Li, Xuelong
    IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (06) : 2400 - 2413
  • [2] Progressive Cross-Modal Semantic Network for Zero-Shot Sketch-Based Image Retrieval
    Deng, Cheng
    Xu, Xinxun
    Wang, Hao
    Yang, Muli
    Tao, Dacheng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 8892 - 8902
  • [3] A Zero-Shot Sketch-Based Intermodal Object Retrieval Scheme for Remote Sensing Images
    Chaudhuri, Ushasi
    Banerjee, Biplab
    Bhattacharya, Avik
    Datcu, Mihai
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [4] Learning Deep Cross-Modal Embedding Networks for Zero-Shot Remote Sensing Image Scene Classification
    Li, Yansheng
    Zhu, Zhihui
    Yu, Jin-Gang
    Zhang, Yongjun
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2021, 59 (12) : 10590 - 10603
  • [5] A Cross-Modal Alignment for Zero-Shot Image Classification
    Wu, Lu
    Wu, Chenyu
    Guo, Han
    Zhao, Zhihao
    IEEE ACCESS, 2023, 11 : 9067 - 9073
  • [6] Remote Sensing Cross-Modal Retrieval by Deep Image-Voice Hashing
    Zhang, Yichao
    Zheng, Xiangtao
    Lu, Xiaoqiang
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2022, 15 : 9327 - 9338
  • [7] Mining Contrastive Relations Between Cross-Modal Features for Zero-Shot Remote Sensing Image Scene Classification
    Liu, Chun
    Ma, Suqiang
    Li, Zheng
    Yang, Wei
    Han, Zhigang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21
  • [8] Integrating Multisubspace Joint Learning With Multilevel Guidance for Cross-Modal Retrieval of Remote Sensing Images
    Chen, Yaxiong
    Huang, Jirui
    Xiong, Shengwu
    Lu, Xiaoqiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 17
  • [9] Consistency Center-Based Deep Cross-Modal Hashing for Multisource Remote Sensing Image Retrieval
    Sun, Yuxi
    Ye, Yunming
    Kang, Jian
    Fernandez-Beltran, Ruben
    Li, Xutao
    Xiong, Zhenyu
    Huang, Xu
    Plaza, Antonio
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [10] Deep Cross-Modal Image-Voice Retrieval in Remote Sensing
    Chen, Yaxiong
    Lu, Xiaoqiang
    Wang, Shuai
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2020, 58 (10) : 7049 - 7061