Cross-modal Representation Learning with Nonlinear Dimensionality Reduction

Cited by: 0
Authors
Kaya, Semih [1 ]
Vural, Elif [1 ]
Affiliations
[1] Orta Dogu Tekn Univ, Elektr & Elekt Muhendisligi Bolumu, Ankara, Turkey
Source
2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU) | 2019
Keywords
Cross-modal learning; multi-view learning; nonlinear projections;
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic & Communication Technology];
Subject Classification Code
0808 ; 0809 ;
Abstract
In many machine learning problems, relations exist between data collections from different modalities. The purpose of multi-modal learning algorithms is to use the information present in the different modalities efficiently when solving multi-modal retrieval problems. In this work, a multi-modal representation learning algorithm based on nonlinear dimensionality reduction is proposed. Compared to linear dimensionality reduction methods, nonlinear methods provide more flexible representations, especially when there is high discrepancy between the structures of the different modalities. We propose to align different modalities by mapping same-class training data from different modalities to nearby coordinates, while also learning a Lipschitz-continuous interpolation function that generalizes the learnt representation to the whole data space. Experiments on image-text retrieval applications show that the proposed method yields high performance compared to multi-modal learning methods in the literature.
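The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' exact algorithm: same-class samples from two synthetic modalities are mapped to a shared per-class anchor coordinate, and a Gaussian RBF interpolator (a smooth stand-in for the Lipschitz-continuous interpolation function the abstract mentions) extends each modality's mapping to unseen data. All variable names and parameter choices here are hypothetical.

```python
# Sketch: cross-modal alignment via shared class anchors + RBF out-of-sample
# extension. Not the paper's method; a hedged illustration of the idea only.
import numpy as np

rng = np.random.default_rng(0)

def rbf_fit(X, Y, gamma=1.0, reg=1e-6):
    # Fit RBF interpolation weights W so that K(X, X) @ W ~= Y.
    K = np.exp(-gamma * ((X[:, None] - X[None]) ** 2).sum(-1))
    return np.linalg.solve(K + reg * np.eye(len(X)), Y)

def rbf_map(Xq, X, W, gamma=1.0):
    # Map query points into the shared space via the learnt interpolator.
    K = np.exp(-gamma * ((Xq[:, None] - X[None]) ** 2).sum(-1))
    return K @ W

n_classes, n_per = 3, 20
labels = np.repeat(np.arange(n_classes), n_per)
anchors = np.eye(n_classes)          # one shared coordinate per class
targets = anchors[labels]            # same-class samples -> nearby coordinates

# Two synthetic modalities with different dimensions and structures.
means_a = rng.normal(0.0, 2.0, (n_classes, 5))
means_b = rng.normal(0.0, 2.0, (n_classes, 8))
Xa = means_a[labels] + rng.normal(0.0, 0.3, (len(labels), 5))
Xb = means_b[labels] + rng.normal(0.0, 0.3, (len(labels), 8))

Wa = rbf_fit(Xa, targets)
Wb = rbf_fit(Xb, targets)

# Cross-modal retrieval: embed an unseen modality-A query and retrieve the
# nearest modality-B sample in the shared space; it should share its class.
query = means_a[0] + rng.normal(0.0, 0.3, 5)
za = rbf_map(query[None], Xa, Wa)
zb = rbf_map(Xb, Xb, Wb)
retrieved_class = labels[np.argmin(((zb - za) ** 2).sum(axis=1))]
print(retrieved_class)
```

Exact RBF interpolation sends each training point to its anchor, so training data from both modalities land on the same coordinates per class, while the kernel map gives a smooth extension to the rest of the feature space.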
Pages: 4
Related Papers
50 records in total
  • [41] Cross-modal learning using privileged information for long-tailed image classification
    Li, Xiangxian
    Zheng, Yuze
    Ma, Haokai
    Qi, Zhuang
    Meng, Xiangxu
    Meng, Lei
    COMPUTATIONAL VISUAL MEDIA, 2024, 10 (05) : 981 - 992
  • [42] Through-Wall Human Pose Reconstruction Based on Cross-Modal Learning and Self-Supervised Learning
    Zheng, Zhijie
    Zhang, Diankun
    Liang, Xiao
    Liu, Xiaojun
    Fang, Guangyou
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [43] Cross-Modal Information-Guided Network Using Contrastive Learning for Point Cloud Registration
    Xie, Yifan
    Zhu, Jihua
    Li, Shiqi
    Shi, Pengcheng
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (01): : 103 - 110
  • [44] Deep Understanding of Cooking Procedure for Cross-modal Recipe Retrieval
    Chen, Jing-Jing
    Ngo, Chong-Wah
    Feng, Fu-Li
    Chua, Tat-Seng
    PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18), 2018, : 1020 - 1028
  • [45] MULTI-VIEW FUSION THROUGH CROSS-MODAL RETRIEVAL
    Cui, Limeng
    Chen, Zhensong
    Zhang, Jiawei
    He, Lifang
    Shi, Yong
    Yu, Philip S.
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 1977 - 1981
  • [46] Silicon-based inorganic-organic hybrid optoelectronic synaptic devices simulating cross-modal learning
    Li, Yayao
    Wang, Yue
    Yin, Lei
    Huang, Wen
    Peng, Wenbing
    Zhu, Yiyue
    Wang, Kun
    Yang, Deren
    Pi, Xiaodong
    SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (06)
  • [49] Img2Acoustic: A Cross-Modal Gesture Recognition Method Based on Few-Shot Learning
    Zou, Yongpan
    Weng, Jianhao
    Kuang, Wenting
    Jiao, Yang
    Leung, Victor C. M.
    Wu, Kaishun
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2025, 24 (03) : 1496 - 1512
  • [50] DISENTANGLED SPEECH EMBEDDINGS USING CROSS-MODAL SELF-SUPERVISION
    Nagrani, Arsha
    Chung, Joon Son
    Albanie, Samuel
    Zisserman, Andrew
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6829 - 6833