Cross-modal Representation Learning with Nonlinear Dimensionality Reduction

Cited by: 0
Authors
Kaya, Semih [1 ]
Vural, Elif [1 ]
Affiliation
[1] Orta Dogu Tekn Univ, Elektr & Elekt Muhendisligi Bolumu, Ankara, Turkey
Source
2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019
Keywords
Cross-modal learning; multi-view learning; nonlinear projections
DOI
Not available
CLC classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
In many machine learning problems, there exist relations between data collections from different modalities. The purpose of multi-modal learning algorithms is to make efficient use of the information present in the different modalities when solving multi-modal retrieval problems. In this work, a multi-modal representation learning algorithm based on nonlinear dimensionality reduction is proposed. Compared to linear dimensionality reduction methods, nonlinear methods provide more flexible representations, especially when there is a high discrepancy between the structures of the different modalities. We propose to align the modalities by mapping same-class training data from different modalities to nearby coordinates, while also learning a Lipschitz-continuous interpolation function that generalizes the learnt representation to the whole data space. Experiments in image-text retrieval applications show that the proposed method yields high performance compared to multi-modal learning methods in the literature.
Pages: 4
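The minimal sketch below (plain NumPy on synthetic data) illustrates the general idea stated in the abstract above: same-class training samples from two modalities are mapped to common low-dimensional coordinates, and a smooth Gaussian-RBF interpolation map is fitted per modality to extend the embedding to new samples for cross-modal retrieval. All variable names, feature dimensions, and kernel parameters are illustrative assumptions and do not reproduce the authors' algorithm.

```python
import numpy as np

def fit_rbf_map(X, Z, sigma=5.0, reg=1e-3):
    """Fit a Gaussian-RBF interpolation map from features X (N x d) to shared
    coordinates Z (N x k) via kernel ridge regression. A finite sum of Gaussian
    RBFs has bounded gradient, so the learnt map is Lipschitz-continuous,
    giving a smooth out-of-sample extension of the training embedding."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * sigma ** 2))
    W = np.linalg.solve(K + reg * np.eye(len(X)), Z)

    def embed(Xq):
        sq_q = np.sum((Xq[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq_q / (2.0 * sigma ** 2)) @ W

    return embed

rng = np.random.default_rng(0)
n_classes, per_class = 3, 30
labels = np.repeat(np.arange(n_classes), per_class)

# Toy stand-ins for two modalities (e.g. image and text descriptors); the
# dimensions and class-dependent shifts are arbitrary assumptions.
X_img = rng.normal(size=(len(labels), 20)) + 3.0 * labels[:, None]
X_txt = rng.normal(size=(len(labels), 15)) - 3.0 * labels[:, None]

# Shared low-dimensional coordinates: same-class samples from both modalities
# are assigned the same target point, which aligns the two modalities.
class_coords = rng.normal(size=(n_classes, 2))
Z = class_coords[labels]

embed_img = fit_rbf_map(X_img, Z)
embed_txt = fit_rbf_map(X_txt, Z)

# Cross-modal retrieval: embed an image query and rank the text samples by
# distance in the shared space.
query = embed_img(X_img[:1])
dists = np.sum((embed_txt(X_txt) - query) ** 2, axis=1)
print("query class:", labels[0], "-> top retrieved text class:", labels[np.argmin(dists)])
```

In the actual method one would expect the training coordinates to come from a supervised nonlinear embedding objective rather than random per-class targets, and the bandwidth and regularization of the interpolator to be tuned; the sketch only shows how a Lipschitz-continuous interpolation function can carry a class-aligned embedding from training data to unseen queries.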
Related papers
50 items in total
  • [21] Cross-modal interaction between visual and olfactory learning in Apis cerana
    Li-Zhen Zhang
    Shao-Wu Zhang
    Zi-Long Wang
    Wei-Yu Yan
    Zhi-Jiang Zeng
    Journal of Comparative Physiology A, 2014, 200 : 899 - 909
  • [22] Oracle Character Recognition Based on Cross-Modal Deep Metric Learning
    Zhang Y.-K.
    Zhang H.
    Liu Y.-G.
    Liu C.-L.
    Zidonghua Xuebao/Acta Automatica Sinica, 2021, 47 (04): : 791 - 800
  • [23] The Visual Advantage Effect in Comparing Uni-Modal and Cross-Modal Probabilistic Category Learning
    Sun, Xunwei
    Fu, Qiufang
    JOURNAL OF INTELLIGENCE, 2023, 11 (12)
  • [24] Self-Supervised Intra-Modal and Cross-Modal Contrastive Learning for Point Cloud Understanding
    Wu, Yue
    Liu, Jiaming
    Gong, Maoguo
    Gong, Peiran
    Fan, Xiaolong
    Qin, A. K.
    Miao, Qiguang
    Ma, Wenping
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 1626 - 1638
  • [25] Adversarial-Metric Learning for Audio-Visual Cross-Modal Matching
    Zheng, Aihua
    Hu, Menglan
    Jiang, Bo
    Huang, Yan
    Yan, Yan
    Luo, Bin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 338 - 351
  • [26] Cross-Modal Learning for Anomaly Detection in Complex Industrial Process: Methodology and Benchmark
    Wu, Gaochang
    Zhang, Yapeng
    Deng, Lan
    Zhang, Jingxin
    Chai, Tianyou
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (03) : 2632 - 2645
  • [27] Lifelong Visual-Tactile Cross-Modal Learning for Robotic Material Perception
    Zheng, Wendong
    Liu, Huaping
    Sun, Fuchun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (03) : 1192 - 1203
  • [28] Multispectral Object Detection via Cross-Modal Conflict-Aware Learning
    He, Xiao
    Tang, Chang
    Zou, Xin
    Zhang, Wei
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 1465 - 1474
  • [29] Facial action unit detection with emotion consistency: a cross-modal learning approach
    Song, Wenyu
    Liu, Dongxin
    An, Gaoyun
    Duan, Yun
    Wang, Laifu
    MULTIMEDIA SYSTEMS, 2024, 30 (06)
  • [30] Uncertainty-Guided Cross-Modal Learning for Robust Multispectral Pedestrian Detection
    Kim, Jung Uk
    Park, Sungjune
    Ro, Yong Man
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (03) : 1510 - 1523