Zero-Shot Cross-Modal Retrieval for Remote Sensing Images With Minimal Supervision

Cited by: 15
Authors
Chaudhuri, Ushasi [1]
Bose, Rupak [1]
Banerjee, Biplab [1]
Bhattacharya, Avik [1]
Datcu, Mihai [2]
Affiliations
[1] Indian Inst Technol, Ctr Studies Resources Engn CSRE, Mumbai 400076, Maharashtra, India
[2] German Aerosp Ctr DLR, D-82234 Wessling, Germany
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2022 / Vol. 60
Keywords
Training; Protocols; Task analysis; Semantics; Sensors; Image retrieval; Training data; Cross-modal retrieval (CMR); few-shot learning (FSL); multispectral; remote sensing (RS); synthetic aperture radar (SAR); zero-shot learning (ZSL); CLASSIFICATION; NETWORK; MODEL;
DOI
10.1109/TGRS.2022.3196307
Chinese Library Classification (CLC)
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification codes
0708; 070902
Abstract
The performance of a deep-learning-based model relies primarily on the diversity and size of the training dataset. However, obtaining such a large amount of labeled data for practical remote sensing (RS) applications is expensive and labor-intensive. Training protocols have previously been proposed for few-shot learning (FSL) and zero-shot learning (ZSL). However, FSL cannot handle unobserved class data at the inference phase, while ZSL requires many training samples of the seen classes. In this work, we propose a novel training protocol for image retrieval and call it label-deficit zero-shot learning (LDZSL). We use this LDZSL training protocol for the challenging task of cross-sensor data retrieval in RS. The protocol uses very few labeled data samples of the seen classes during training and interprets unobserved class data samples at the inference phase. This strategy is critical because some data modalities are hard to annotate without domain experts. This work proposes a novel bilevel Siamese network to perform LDZSL cross-sensor retrieval of multispectral and synthetic aperture radar (SAR) images. We use the available georeferenced SAR and multispectral data to domain-align the embedding features of the two modalities. We experimentally demonstrate the proposed model's efficacy on the So2Sat dataset against existing state-of-the-art models of the ZSL framework trained under a reduced training set. We also show the generalizability of the proposed model using a sketch-based image retrieval task. Experimental results on the Earth on the Canvas dataset exhibit performance comparable to the literature.
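The data split implied by the LDZSL protocol can be illustrated with a minimal sketch. It assumes only what the abstract states: training sees a few labeled samples from each seen class, and inference is run on classes withheld from training. The function name `ldzsl_split`, its parameters, and the class names in the usage lines are hypothetical and are not taken from the paper, whose actual splits and shot counts are not given in this record.

```python
import random
from collections import defaultdict


def ldzsl_split(labels, unseen_classes, shots_per_class, seed=0):
    """Return (train_idx, test_idx) for a label-deficit zero-shot split.

    labels: one class label per sample.
    unseen_classes: classes withheld entirely from training.
    shots_per_class: number of labeled samples kept per seen class.
    """
    rng = random.Random(seed)
    unseen = set(unseen_classes)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label, indices in by_class.items():
        if label in unseen:
            # Unseen classes appear only at inference time.
            test_idx.extend(indices)
        else:
            # Seen classes contribute only a few labeled samples.
            k = min(shots_per_class, len(indices))
            train_idx.extend(rng.sample(indices, k))
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy labels: two seen classes and one unseen class.
labels = ["urban"] * 10 + ["forest"] * 10 + ["water"] * 10
train_idx, test_idx = ldzsl_split(labels, unseen_classes=["water"], shots_per_class=3)
```

In this toy setup the training indices cover three samples each of the two seen classes, and every sample of the withheld class is reserved for retrieval at inference. A cross-sensor version would apply the same index split to paired SAR and multispectral samples.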
Pages: 15