Cross-modal Representation Learning with Nonlinear Dimensionality Reduction

Cited by: 0
Authors
Kaya, Semih [1 ]
Vural, Elif [1 ]
Affiliation
[1] Orta Dogu Tekn Univ, Elektr & Elekt Muhendisligi Bolumu, Ankara, Turkey
Source
2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019
Keywords
Cross-modal learning; multi-view learning; nonlinear projections
DOI
Not available
CLC classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
In many machine learning problems, there exist relations between data collections from different modalities. The purpose of multi-modal learning algorithms is to make efficient use of the information present in the different modalities when solving multi-modal retrieval problems. In this work, a multi-modal representation learning algorithm based on nonlinear dimensionality reduction is proposed. Compared to linear dimensionality reduction methods, nonlinear methods provide more flexible representations, especially when there is a high discrepancy between the structures of the different modalities. We propose to align the modalities by mapping same-class training data from different modalities to nearby coordinates, while also learning a Lipschitz-continuous interpolation function that generalizes the learnt representation to the whole data space. Experiments in image-text retrieval applications show that the proposed method yields high performance compared to multi-modal learning methods in the literature.
Pages: 4
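The minimal sketch below (plain NumPy on synthetic data) illustrates the general idea stated in the abstract above: same-class training samples from two modalities are mapped to common low-dimensional coordinates, and a smooth Gaussian-RBF interpolation map is fitted per modality to extend the embedding to new samples for cross-modal retrieval. All variable names, feature dimensions, and kernel parameters are illustrative assumptions and do not reproduce the authors' algorithm.

```python
import numpy as np

def fit_rbf_map(X, Z, sigma=5.0, reg=1e-3):
    """Fit a Gaussian-RBF interpolation map from features X (N x d) to shared
    coordinates Z (N x k) via kernel ridge regression. A finite sum of Gaussian
    RBFs has bounded gradient, so the learnt map is Lipschitz-continuous,
    giving a smooth out-of-sample extension of the training embedding."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * sigma ** 2))
    W = np.linalg.solve(K + reg * np.eye(len(X)), Z)

    def embed(Xq):
        sq_q = np.sum((Xq[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq_q / (2.0 * sigma ** 2)) @ W

    return embed

rng = np.random.default_rng(0)
n_classes, per_class = 3, 30
labels = np.repeat(np.arange(n_classes), per_class)

# Toy stand-ins for two modalities (e.g. image and text descriptors); the
# dimensions and class-dependent shifts are arbitrary assumptions.
X_img = rng.normal(size=(len(labels), 20)) + 3.0 * labels[:, None]
X_txt = rng.normal(size=(len(labels), 15)) - 3.0 * labels[:, None]

# Shared low-dimensional coordinates: same-class samples from both modalities
# are assigned the same target point, which aligns the two modalities.
class_coords = rng.normal(size=(n_classes, 2))
Z = class_coords[labels]

embed_img = fit_rbf_map(X_img, Z)
embed_txt = fit_rbf_map(X_txt, Z)

# Cross-modal retrieval: embed an image query and rank the text samples by
# distance in the shared space.
query = embed_img(X_img[:1])
dists = np.sum((embed_txt(X_txt) - query) ** 2, axis=1)
print("query class:", labels[0], "-> top retrieved text class:", labels[np.argmin(dists)])
```

In the actual method one would expect the training coordinates to come from a supervised nonlinear embedding objective rather than random per-class targets, and the bandwidth and regularization of the interpolator to be tuned; the sketch only shows how a Lipschitz-continuous interpolation function can carry a class-aligned embedding from training data to unseen queries.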
Related papers
50 items in total
  • [21] Cross-modal interaction between visual and olfactory learning in Apis cerana
    Li-Zhen Zhang
    Shao-Wu Zhang
    Zi-Long Wang
    Wei-Yu Yan
    Zhi-Jiang Zeng
    Journal of Comparative Physiology A, 2014, 200 : 899 - 909
  • [22] Oracle Character Recognition Based on Cross-Modal Deep Metric Learning
    Zhang Y.-K.
    Zhang H.
    Liu Y.-G.
    Liu C.-L.
    Zidonghua Xuebao/Acta Automatica Sinica, 2021, 47 (04): : 791 - 800
  • [23] The Visual Advantage Effect in Comparing Uni-Modal and Cross-Modal Probabilistic Category Learning
    Sun, Xunwei
    Fu, Qiufang
    JOURNAL OF INTELLIGENCE, 2023, 11 (12)
  • [24] Self-Supervised Intra-Modal and Cross-Modal Contrastive Learning for Point Cloud Understanding
    Wu, Yue
    Liu, Jiaming
    Gong, Maoguo
    Gong, Peiran
    Fan, Xiaolong
    Qin, A. K.
    Miao, Qiguang
    Ma, Wenping
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 1626 - 1638
  • [25] Adversarial-Metric Learning for Audio-Visual Cross-Modal Matching
    Zheng, Aihua
    Hu, Menglan
    Jiang, Bo
    Huang, Yan
    Yan, Yan
    Luo, Bin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 338 - 351
  • [26] Cross-Modal Learning for Anomaly Detection in Complex Industrial Process: Methodology and Benchmark
    Wu, Gaochang
    Zhang, Yapeng
    Deng, Lan
    Zhang, Jingxin
    Chai, Tianyou
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (03) : 2632 - 2645
  • [27] Lifelong Visual-Tactile Cross-Modal Learning for Robotic Material Perception
    Zheng, Wendong
    Liu, Huaping
    Sun, Fuchun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (03) : 1192 - 1203
  • [28] Multispectral Object Detection via Cross-Modal Conflict-Aware Learning
    He, Xiao
    Tang, Chang
    Zou, Xin
    Zhang, Wei
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 1465 - 1474
  • [29] Facial action unit detection with emotion consistency: a cross-modal learning approach
    Song, Wenyu
    Liu, Dongxin
    An, Gaoyun
    Duan, Yun
    Wang, Laifu
    MULTIMEDIA SYSTEMS, 2024, 30 (06)
  • [30] Uncertainty-Guided Cross-Modal Learning for Robust Multispectral Pedestrian Detection
    Kim, Jung Uk
    Park, Sungjune
    Ro, Yong Man
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (03) : 1510 - 1523