Cross-modal Representation Learning with Nonlinear Dimensionality Reduction

Cited by: 0

Authors:

Kaya, Semih [1]

Vural, Elif [1]

Affiliations:

[1] Orta Dogu Tekn Univ, Elektr & Elekt Muhendisligi Bolumu, Ankara, Turkey

Source:

2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU) | 2019

Keywords:

Cross-modal learning; multi-view learning; nonlinear projections

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

In many machine learning problems, data collections from different modalities are related to one another. Multi-modal learning algorithms aim to use the information present in the different modalities efficiently when solving multi-modal retrieval problems. This work proposes a multi-modal representation learning algorithm based on nonlinear dimensionality reduction. Compared with linear dimensionality reduction methods, nonlinear methods provide more flexible representations, especially when the structures of the different modalities differ substantially. The proposed method aligns the modalities by mapping same-class training data from different modalities to nearby coordinates, while also learning a Lipschitz-continuous interpolation function that generalizes the learnt representation to the whole data space. Experiments in image-text retrieval applications show that the proposed method achieves high performance compared with multi-modal learning methods in the literature.
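
The idea summarized in the abstract can be illustrated with a small toy sketch. It is not the authors' algorithm: the abstract does not state the objective function or the form of the interpolator, so everything below is an assumption made for illustration. The sketch uses synthetic 5-D "image" and 8-D "text" features, takes one-hot class anchors as a stand-in for the learned low-dimensional coordinates (so same-class samples from both modalities share a target), and fits a regularized Gaussian RBF interpolator per modality. A finite sum of Gaussians has a bounded gradient, so such a map is Lipschitz continuous. The names gaussian_kernel and fit_rbf_interpolator and the values of sigma and reg are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Pairwise Gaussian kernel values between rows of A and rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def fit_rbf_interpolator(X, Y, sigma=2.0, reg=1e-3):
    """Fit a regularized Gaussian RBF map from features X to coordinates Y."""
    K = gaussian_kernel(X, X, sigma)
    coeffs = np.linalg.solve(K + reg * np.eye(len(X)), Y)
    return lambda Z: gaussian_kernel(Z, X, sigma) @ coeffs

rng = np.random.default_rng(0)
n_classes, n_per_class = 3, 20
labels = np.repeat(np.arange(n_classes), n_per_class)

# Toy data (assumption): 5-D "image" and 8-D "text" features, clustered by class.
img_centers = 6.0 * np.eye(n_classes, 5)
txt_centers = 6.0 * np.eye(n_classes, 8)
X_img = img_centers[labels] + 0.3 * rng.normal(size=(labels.size, 5))
X_txt = txt_centers[labels] + 0.3 * rng.normal(size=(labels.size, 8))

# Stand-in for the learned embedding (assumption): same-class samples from
# both modalities share one target coordinate, here a one-hot class anchor.
targets = np.eye(n_classes)[labels]

# One interpolator per modality extends the embedding to unseen samples.
embed_img = fit_rbf_interpolator(X_img, targets)
embed_txt = fit_rbf_interpolator(X_txt, targets)

# Image-to-text retrieval: embed unseen image queries, return the nearest text.
queries = img_centers + 0.3 * rng.normal(size=img_centers.shape)
query_emb = embed_img(queries)
text_emb = embed_txt(X_txt)
dists = np.linalg.norm(query_emb[:, None, :] - text_emb[None, :, :], axis=-1)
retrieved_labels = labels[dists.argmin(axis=1)]
print(retrieved_labels)
```

In this toy setting each image query should retrieve a text sample of its own class, because both modalities are pulled toward the same class coordinates in the shared space. A real system would replace the one-hot anchors with coordinates learned jointly with the interpolator.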


Pages: 4
