Cross-modal Representation Learning with Nonlinear Dimensionality Reduction

Cited by: 0
Authors
Kaya, Semih [1]
Vural, Elif [1]
Affiliations
[1] Orta Dogu Tekn Univ, Elektr & Elekt Muhendisligi Bolumu, Ankara, Turkey
Source
2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU) | 2019
Keywords
Cross-modal learning; multi-view learning; nonlinear projections
DOI
Not available
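Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809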
Abstract
Many machine learning problems involve related data collections from different modalities. Multi-modal learning algorithms aim to use the information present in the different modalities efficiently when solving multi-modal retrieval problems. This work proposes a multi-modal representation learning algorithm based on nonlinear dimensionality reduction. Compared to linear dimensionality reduction methods, nonlinear methods provide more flexible representations, especially when the structures of the modalities differ considerably. The proposed method aligns the modalities by mapping same-class training data from different modalities to nearby coordinates, while also learning a Lipschitz-continuous interpolation function that generalizes the learnt representation to the whole data space. Experiments in image-text retrieval applications show that the proposed method performs well compared to multi-modal learning methods in the literature.
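The idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes fixed one-hot class coordinates as the shared embedding (the paper learns the embedding), and it swaps in a ridge-regularized Gaussian RBF interpolator as one example of a Lipschitz-continuous map. All function names, the synthetic "image" and "text" features, and the parameter values are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def fit_interpolator(X, Z, sigma=2.0, ridge=1e-3):
    """Fit f(x) = sum_i k(x, x_i) c_i so that f(X[i]) is close to Z[i].

    A finite sum of Gaussians with fixed coefficients has a bounded
    gradient, so f is Lipschitz continuous (assumed stand-in, not the
    interpolator from the paper).
    """
    K = gaussian_kernel(X, X, sigma)
    coeffs = np.linalg.solve(K + ridge * np.eye(len(X)), Z)
    return lambda Q: gaussian_kernel(Q, X, sigma) @ coeffs

def make_modality(centers, labels, rng, noise=0.5):
    """Synthetic features: one noisy cluster per class (hypothetical data)."""
    return centers[labels] + noise * rng.normal(size=(len(labels), centers.shape[1]))

def run_demo(seed=0, n_classes=3, per_class=20):
    rng = np.random.default_rng(seed)
    train_labels = np.repeat(np.arange(n_classes), per_class)
    test_labels = np.repeat(np.arange(n_classes), 10)

    # Two modalities with different dimensions and cluster layouts.
    img_centers = 4.0 * np.eye(n_classes, 5)
    txt_centers = 4.0 * np.eye(n_classes, 4)
    img_train = make_modality(img_centers, train_labels, rng)
    txt_train = make_modality(txt_centers, train_labels, rng)
    img_test = make_modality(img_centers, test_labels, rng)

    # Same-class samples from both modalities share one target coordinate.
    targets = np.eye(n_classes)[train_labels]
    embed_img = fit_interpolator(img_train, targets)
    embed_txt = fit_interpolator(txt_train, targets)

    # Image-to-text retrieval: nearest text sample in the shared space.
    queries = embed_img(img_test)
    gallery = embed_txt(txt_train)
    dists = ((queries[:, None, :] - gallery[None, :, :]) ** 2).sum(axis=-1)
    retrieved = train_labels[dists.argmin(axis=1)]
    return float((retrieved == test_labels).mean())

if __name__ == "__main__":
    print(f"image-to-text retrieval accuracy: {run_demo():.2f}")
```

Because the interpolators are defined on the whole input space, unseen queries can be embedded without refitting, which is the generalization property the abstract attributes to the learnt interpolation function.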
Pages: 4
Related papers
50 in total
  • [1] MULTI-MODAL LEARNING WITH GENERALIZABLE NONLINEAR DIMENSIONALITY REDUCTION
    Kaya, Semih
    Vural, Elif
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 2139 - 2143
  • [2] CrossFormer: Cross-Modal Representation Learning via Heterogeneous Graph Transformer
    Liang, Xiao
    Yang, Erkun
    Deng, Cheng
    Yang, Yanhua
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (12)
  • [3] Unsupervised Cross-Modal Audio Representation Learning from Unstructured Multilingual Text
    Schindler, Alexander
    Gordea, Sergiu
    Knees, Peter
    PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 2020, : 706 - 713
  • [4] Trusted 3D self-supervised representation learning with cross-modal settings
    Han, Xu
    Cheng, Haozhe
    Shi, Pengcheng
    Zhu, Jihua
    MACHINE VISION AND APPLICATIONS, 2024, 35 (04)
  • [5] CROSS-MODAL REPRESENTATION RECONSTRUCTION FOR ZERO-SHOT CLASSIFICATION
    Wang, Yu
    Zhao, Shenjie
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 2820 - 2824
  • [6] Cross-modal Learning and Its Cognitive and Neural Mechanisms
    Sun Xun-Wei
    Sun Ying
    Fu Qiu-Fang
    PROGRESS IN BIOCHEMISTRY AND BIOPHYSICS, 2019, 46 (06) : 565 - 577
  • [7] Editorial: Cross-Modal Learning: Adaptivity, Prediction and Interaction
    Zhang, Jianwei
    Wermter, Stefan
    Sun, Fuchun
    Zhang, Changshui
    Engel, Andreas K.
    Roeder, Brigitte
    Fu, Xiaolan
    Xue, Gui
    FRONTIERS IN NEUROROBOTICS, 2022, 16
  • [8] Binding and Cross-Modal Learning in Markov Logic Networks
    Vrecko, Alen
    Skocaj, Danijel
    Leonardis, Ales
    ADAPTIVE AND NATURAL COMPUTING ALGORITHMS, PT II, 2011, 6594 : 235 - 244
  • [9] Cross-Modal Graph Contrastive Learning with Cellular Images
    Zheng, Shuangjia
    Rao, Jiahua
    Zhang, Jixian
    Zhou, Lianyu
    Xie, Jiancong
    Cohen, Ethan
    Lu, Wei
    Li, Chengtao
    Yang, Yuedong
    ADVANCED SCIENCE, 2024, 11 (32)
  • [10] A Framework of Cross-Modal Learning for Solving Geometry Problems
    Guo, Fucheng
    Jian, Pengpeng
    Wang, Yanli
    Wang, Qingjiang
    IEEE TALE2021: IEEE INTERNATIONAL CONFERENCE ON ENGINEERING, TECHNOLOGY AND EDUCATION, 2021, : 506 - 512