Learning sufficient scene representation for unsupervised cross-modal retrieval

Cited by: 6
Authors
Luo, Jieting [1]
Wo, Yan [1]
Wu, Bicheng [1]
Han, Guoqiang [1]
Affiliations
[1] South China University of Technology, School of Computer Science and Engineering, Guangzhou, People's Republic of China
Keywords
Unsupervised cross-modal retrieval; Common representation; Statistical manifold; Gaussian Mixture Model; Geodesic distance
DOI
10.1016/j.neucom.2021.07.078
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes CMSSR, a novel unsupervised Cross-Modal retrieval method via Sufficient Scene Representation. Unlike existing methods, which mainly focus on simultaneously preserving the mutually constrained intra- and inter-modal similarity relations, CMSSR treats data of different modalities as descriptions of one scene from different views and accordingly integrates information from the different modalities to learn a complete common representation containing sufficient information about the corresponding scene. To obtain such a common representation, a Gaussian Mixture Model (GMM) is first used to generate a statistical representation of each uni-modal datum, and the uni-modal spaces are accordingly abstracted as uni-modal statistical manifolds. The common space is then assumed to be a high-dimensional statistical manifold that has the uni-modal statistical manifolds as its sub-manifolds. To generate a sufficient scene representation from uni-modal data, a representation completion strategy based on logistic regression is proposed to complete the missing representation of the other modality. The similarity between multi-modal data can then be reflected more accurately by the distance metric in the common statistical manifold. Based on this distance metric, Iterative Quantization is used to generate binary codes for fast cross-modal retrieval. Extensive experiments on three standard benchmark datasets demonstrate the superiority of CMSSR over several state-of-the-art methods. © 2021 Published by Elsevier B.V.
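The abstract measures similarity by a geodesic distance on a statistical manifold but does not give the metric itself. As a rough illustration of the idea only, the sketch below computes the closed-form Fisher-Rao geodesic distance between two univariate Gaussians. This is an assumption made for illustration: CMSSR works with Gaussian Mixture Models, for which no such closed form exists, and the function name `fisher_rao_distance` is hypothetical rather than taken from the paper.

```python
import math


def fisher_rao_distance(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).

    Illustrative only: a single univariate Gaussian stands in for the GMM
    representations described in the abstract.
    """
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    # The Fisher metric on (mu, sigma) is ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2.
    # Rescaling mu by 1/sqrt(2) turns it into 2x the hyperbolic half-plane
    # metric, so the geodesic distance is sqrt(2) times the hyperbolic distance.
    dx = (mu1 - mu2) / math.sqrt(2.0)
    dy = sigma1 - sigma2
    cosh_arg = 1.0 + (dx * dx + dy * dy) / (2.0 * sigma1 * sigma2)
    return math.sqrt(2.0) * math.acosh(cosh_arg)


def rank_by_geodesic(query, gallery):
    """Return gallery indices sorted by increasing distance to the query.

    `query` and each gallery item are (mu, sigma) pairs.
    """
    distances = [fisher_rao_distance(query[0], query[1], m, s) for m, s in gallery]
    return sorted(range(len(gallery)), key=lambda i: distances[i])
```

Unlike a Euclidean distance on the parameters, this distance depends on scale: a fixed shift in the mean counts for less when both distributions are wide, which is the kind of behaviour a manifold-aware metric is meant to capture.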
Pages: 404-418
Number of pages: 15