Zero-Shot Cross-Modal Retrieval for Remote Sensing Images With Minimal Supervision

Cited by: 15
Authors
Chaudhuri, Ushasi [1]
Bose, Rupak [1]
Banerjee, Biplab [1]
Bhattacharya, Avik [1]
Datcu, Mihai [2]
Affiliations
[1] Indian Inst Technol, Ctr Studies Resources Engn CSRE, Mumbai 400076, Maharashtra, India
[2] German Aerosp Ctr DLR, D-82234 Wessling, Germany
Keywords
Training; Protocols; Task analysis; Semantics; Sensors; Image retrieval; Training data; Cross-modal retrieval (CMR); few-shot learning (FSL); multispectral; remote sensing (RS); synthetic aperture radar (SAR); zero-shot learning (ZSL); CLASSIFICATION; NETWORK; MODEL
DOI
10.1109/TGRS.2022.3196307
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Subject classification codes
0708; 070902
Abstract
The performance of a deep-learning-based model relies primarily on the diversity and size of the training dataset. However, obtaining a large amount of labeled data for practical remote sensing (RS) applications is expensive and labor-intensive. Training protocols have previously been proposed for few-shot learning (FSL) and zero-shot learning (ZSL). However, FSL cannot handle unobserved class data at the inference phase, while ZSL requires many training samples of the seen classes. In this work, we propose a novel training protocol for image retrieval and name it label-deficit zero-shot learning (LDZSL). We use this LDZSL training protocol for the challenging task of cross-sensor data retrieval in RS. The protocol uses very few labeled data samples of the seen classes during training and interprets unobserved class data samples at the inference phase. This strategy is critical because some data modalities are hard to annotate without domain experts. We propose a novel bilevel Siamese network to perform LDZSL cross-sensor retrieval of multispectral and synthetic aperture radar (SAR) images. We use the available georeferenced SAR and multispectral data to domain-align the embedding features of the two modalities. We experimentally demonstrate the proposed model's efficacy on the So2Sat dataset, comparing it with existing state-of-the-art ZSL models trained on a reduced training set. We also show the generalizability of the proposed model on a sketch-based image retrieval task. Experimental results on the Earth on the Canvas dataset show performance comparable to that reported in the literature.
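The data split implied by the LDZSL protocol can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ldzsl_split`, its parameters, and the toy class names are hypothetical, and the sketch only assumes what the abstract states, namely that training sees a few labeled samples per seen class and that inference is evaluated on classes never observed during training.

```python
import random
from collections import defaultdict


def ldzsl_split(labels, seen_classes, shots_per_class, seed=0):
    """Return (train_idx, test_idx) for a label-deficit zero-shot split.

    Hypothetical helper: training keeps at most `shots_per_class` labeled
    samples of each seen class; testing uses every sample of unseen classes.
    """
    rng = random.Random(seed)
    seen = set(seen_classes)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label, indices in by_class.items():
        if label in seen:
            # Label deficit: keep only a handful of samples per seen class.
            k = min(shots_per_class, len(indices))
            train_idx.extend(rng.sample(indices, k))
        else:
            # Zero-shot: unseen classes appear only at inference time.
            test_idx.extend(indices)
    return sorted(train_idx), sorted(test_idx)


# Toy labels, assumed for illustration only.
labels = ["water", "forest", "urban", "water", "forest", "urban", "crop", "crop"]
train_idx, test_idx = ldzsl_split(
    labels, seen_classes=["water", "forest", "urban"], shots_per_class=1
)
```

In this toy run, `train_idx` holds one index per seen class and `test_idx` holds only the indices of the unseen class `crop`, so no class overlaps between the two sets.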
Pages: 15
Related papers
50 in total
  • [1] Generalized Zero-Shot Cross-Modal Retrieval
    Dutta, Titir
    Biswas, Soma
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (12) : 5953 - 5962
  • [2] Ternary Adversarial Networks With Self-Supervision for Zero-Shot Cross-Modal Retrieval
    Xu, Xing
    Lu, Huimin
    Song, Jingkuan
    Yang, Yang
    Shen, Heng Tao
    Li, Xuelong
    IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (06) : 2400 - 2413
  • [3] Cross-modal Zero-shot Hashing
    Liu, Xuanwu
    Li, Zhao
    Wang, Jun
    Yu, Guoxian
    Domeniconi, Carlotta
    Zhang, Xiangliang
    2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019, : 449 - 458
  • [4] CHOP: An orthogonal hashing method for zero-shot cross-modal retrieval
    Yuan, Xu
    Wang, Guangze
    Chen, Zhikui
    Zhong, Fangming
    PATTERN RECOGNITION LETTERS, 2021, 145 : 247 - 253
  • [5] A Simplified Framework for Zero-shot Cross-Modal Sketch Data Retrieval
    Chaudhuri, Ushasi
    Banerjee, Biplab
    Bhattacharya, Avik
    Datcu, Mihai
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 699 - 706
  • [6] Multimodal Disentanglement Variational AutoEncoders for Zero-Shot Cross-Modal Retrieval
    Tian, Jialin
    Wang, Kai
    Xu, Xing
    Cao, Zuo
    Shen, Fumin
    Shen, Heng Tao
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 960 - 969
  • [7] Correlated Features Synthesis and Alignment for Zero-shot Cross-modal Retrieval
    Xu, Xing
    Lin, Kaiyi
    Lu, Huimin
    Gao, Lianli
    Shen, Heng Tao
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 1419 - 1428
  • [8] Mining on Heterogeneous Manifolds for Zero-Shot Cross-Modal Image Retrieval
    Yang, Fan
    Wang, Zheng
    Xiao, Jing
    Satoh, Shin'ichi
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 12589 - 12596
  • [9] Discrete asymmetric zero-shot hashing with application to cross-modal retrieval
    Shu, Zhenqiu
    Yong, Kailing
    Yu, Jun
    Gao, Shengxiang
    Mao, Cunli
    Yu, Zhengtao
    NEUROCOMPUTING, 2022, 511 : 366 - 379
  • [10] Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval
    Lin, Kaiyi
    Xu, Xing
    Gao, Lianli
    Wang, Zheng
    Shen, Heng Tao
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11515 - 11522