Zero-Shot Cross-Modal Retrieval for Remote Sensing Images With Minimal Supervision

Cited by: 15
Authors
Chaudhuri, Ushasi [1]
Bose, Rupak [1]
Banerjee, Biplab [1]
Bhattacharya, Avik [1]
Datcu, Mihai [2]
Affiliations
[1] Indian Inst Technol, Ctr Studies Resources Engn CSRE, Mumbai 400076, Maharashtra, India
[2] German Aerosp Ctr DLR, D-82234 Wessling, Germany
Keywords
Training; Protocols; Task analysis; Semantics; Sensors; Image retrieval; Training data; Cross-modal retrieval (CMR); few-shot learning (FSL); multispectral; remote sensing (RS); synthetic aperture radar (SAR); zero-shot learning (ZSL); CLASSIFICATION; NETWORK; MODEL;
DOI
10.1109/TGRS.2022.3196307
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
The performance of a deep-learning-based model relies primarily on the diversity and size of the training dataset. However, obtaining such a large amount of labeled data for practical remote sensing (RS) applications is expensive and labor-intensive. Training protocols have previously been proposed for few-shot learning (FSL) and zero-shot learning (ZSL). However, FSL cannot handle unobserved class data at the inference phase, while ZSL requires many training samples of the seen classes. In this work, we propose a novel training protocol for image retrieval and name it label-deficit zero-shot learning (LDZSL). We use this LDZSL training protocol for the challenging task of cross-sensor data retrieval in RS. The protocol uses very few labeled data samples of the seen classes during training and interprets unobserved class data samples at the inference phase. This strategy is critical because some data modalities are hard to annotate without domain experts. We propose a novel bilevel Siamese network to perform LDZSL cross-sensor retrieval of multispectral and synthetic aperture radar (SAR) images. We use the available georeferenced SAR and multispectral data to domain-align the embedding features of the two modalities. We experimentally demonstrate the proposed model's efficacy on the So2Sat dataset in comparison with existing state-of-the-art ZSL models trained under a reduced training set. We also show the generalizability of the proposed model on a sketch-based image retrieval task. Experimental results on the Earth on the Canvas dataset show performance comparable to that reported in the literature.
Pages: 15
Related papers
50 in total
  • [21] Cross-modal Representation Learning for Zero-shot Action Recognition
    Lin, Chung-Ching
    Lin, Kevin
    Wang, Lijuan
    Liu, Zicheng
    Li, Linjie
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 19946 - 19956
  • [22] Manifold regularized cross-modal embedding for zero-shot learning
    Ji, Zhong
    Yu, Yunlong
    Pang, Yanwei
    Guo, Jichang
    Zhang, Zhongfei
    INFORMATION SCIENCES, 2017, 378 : 48 - 58
  • [23] Cross-modal propagation network for generalized zero-shot learning
    Guo, Ting
    Liang, Jianqing
    Liang, Jiye
    Xie, Guo-Sen
    PATTERN RECOGNITION LETTERS, 2022, 159 : 125 - 131
  • [24] Two-stage zero-shot sparse hashing with missing labels for cross-modal retrieval
    Yong, Kailing
    Shu, Zhenqiu
    Wang, Hongbin
    Yu, Zhengtao
    PATTERN RECOGNITION, 2024, 155
  • [25] Cross-modal Self-distillation for Zero-shot Sketch-based Image Retrieval
    Tian J.-L.
    Xu X.
    Shen F.-M.
    Shen H.-T.
    Ruan Jian Xue Bao/Journal of Software, 2022, 33 (09):
  • [26] Progressive Cross-Modal Semantic Network for Zero-Shot Sketch-Based Image Retrieval
    Deng, Cheng
    Xu, Xinxun
    Wang, Hao
    Yang, Muli
    Tao, Dacheng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 8892 - 8902
  • [27] Deep cross-modal discriminant adversarial learning for zero-shot sketch-based image retrieval
    Jiao, Shichao
    Han, Xie
    Xiong, Fengguang
    Yang, Xiaowen
    Han, Huiyan
    He, Ligang
    Kuang, Liqun
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (16): 13469 - 13483
  • [28] Deep Cross-Modal Image-Voice Retrieval in Remote Sensing
    Chen, Yaxiong
    Lu, Xiaoqiang
    Wang, Shuai
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2020, 58 (10): 7049 - 7061
  • [29] Deep cross-modal discriminant adversarial learning for zero-shot sketch-based image retrieval
    Shichao Jiao
    Xie Han
    Fengguang Xiong
    Xiaowen Yang
    Huiyan Han
    Ligang He
    Liqun Kuang
    Neural Computing and Applications, 2022, 34 : 13469 - 13483
  • [30] Learning Aligned Cross-Modal Representation for Generalized Zero-Shot Classification
    Fang, Zhiyu
    Zhu, Xiaobin
    Yang, Chun
    Han, Zheng
    Qin, Jingyan
    Yin, Xu-Cheng
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6605 - 6613