Cross-modal Representation Learning with Nonlinear Dimensionality Reduction

Cited by: 0

Authors:

Kaya, Semih [1]

Vural, Elif [1]

Affiliations:

[1] Orta Dogu Tekn Univ, Elektr & Elekt Muhendisligi Bolumu, Ankara, Turkey

Source:

2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU) | 2019

Keywords:

Cross-modal learning; multi-view learning; nonlinear projections

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

In many machine learning problems, there exist relations between data collections from different modalities. The purpose of multi-modal learning algorithms is to efficiently use the information present in different modalities when solving multi-modal retrieval problems. In this work, a multi-modal representation learning algorithm based on nonlinear dimensionality reduction is proposed. Compared to linear dimensionality reduction methods, nonlinear methods provide more flexible representations, especially when there is high discrepancy between the structures of different modalities. We propose to align different modalities by mapping same-class training data from different modalities to nearby coordinates, while also learning a Lipschitz-continuous interpolation function that generalizes the learnt representation to the whole data space. Experiments in image-text retrieval applications show that the proposed method yields high performance when compared to multi-modal learning methods in the literature.
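
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes toy Gaussian "image" and "text" features, uses one shared one-hot anchor per class as the target coordinate, and swaps in regularized Gaussian RBF interpolation as the smooth map (a finite sum of Gaussian kernels is Lipschitz-continuous). All function names, dimensions, and parameter values (`gamma`, `reg`, the noise level) are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)

def fit_interpolator(X, targets, gamma=0.1, reg=1e-2):
    """Fit a smooth map X -> targets by regularized Gaussian RBF interpolation."""
    K = gaussian_kernel(X, X, gamma)
    coeffs = np.linalg.solve(K + reg * np.eye(len(X)), targets)
    # The returned function extends the embedding to unseen points.
    return lambda Z: gaussian_kernel(Z, X, gamma) @ coeffs

def make_modality(means, labels, noise, rng):
    """Toy features (assumption): class mean plus isotropic Gaussian noise."""
    return means[labels] + noise * rng.normal(size=(len(labels), means.shape[1]))

rng = np.random.default_rng(0)
n_classes = 3
train_labels = np.repeat(np.arange(n_classes), 20)
test_labels = np.repeat(np.arange(n_classes), 5)

# Two modalities with different dimensions and unrelated geometry.
img_means = 4.0 * np.eye(n_classes, 5)        # "image" features in 5-D
txt_means = 4.0 * np.eye(n_classes, 8, k=2)   # "text" features in 8-D

X_img = make_modality(img_means, train_labels, 0.3, rng)
X_txt = make_modality(txt_means, train_labels, 0.3, rng)

# Same-class samples from both modalities share one target coordinate.
anchors = np.eye(n_classes)[train_labels]
embed_img = fit_interpolator(X_img, anchors)
embed_txt = fit_interpolator(X_txt, anchors)

# Cross-modal retrieval on held-out data: image queries, text gallery.
Q_img = make_modality(img_means, test_labels, 0.3, rng)
G_txt = make_modality(txt_means, test_labels, 0.3, rng)
q, g = embed_img(Q_img), embed_txt(G_txt)

dists = ((q[:, None, :] - g[None, :, :]) ** 2).sum(axis=2)
retrieved = test_labels[dists.argmin(axis=1)]
accuracy = (retrieved == test_labels).mean()
print(f"image-to-text retrieval accuracy: {accuracy:.2f}")
```

Fixing one anchor per class is the simplest way to place same-class data at nearby coordinates; a method that learns the coordinates jointly with the interpolator, as the abstract describes, would replace the `anchors` line with an optimization step.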


Pages: 4
