MULTI-MODAL LEARNING WITH GENERALIZABLE NONLINEAR DIMENSIONALITY REDUCTION

Cited by: 0

Authors:

Kaya, Semih [1]

Vural, Elif [1]

Affiliations:

[1] METU, Dept Elect & Elect Engn, Ankara, Turkey

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2019

Keywords:

Cross-modal learning; multi-view learning; cross-modal retrieval; nonlinear embeddings; RBF interpolators

DOI:

10.1109/icip.2019.8803196

Chinese Library Classification (CLC) number:

TB8 [Photographic technology];

Subject classification code:

0804 ;

Abstract:

In practical machine learning settings, there often exist relations or links between data from different modalities. The goal of multi-modal learning algorithms is to efficiently use the information available in different modalities to solve multi-modal classification or retrieval problems. In this study, we propose a multi-modal supervised representation learning algorithm based on nonlinear dimensionality reduction. Nonlinear embeddings often yield more flexible representations than their linear counterparts, especially when the data geometries in different modalities are highly dissimilar. Based on recent performance bounds on nonlinear dimensionality reduction, we propose an optimization objective that aims to improve the intra- and inter-modal within-class compactness and between-class separation, as well as the Lipschitz regularity of the interpolator that generalizes the embedding to the whole data space. Experiments in multi-view face recognition and image-text retrieval applications show that the proposed method yields promising performance in comparison with state-of-the-art multi-modal learning methods.
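
As a rough illustration of what "an interpolator that generalizes the embedding to the whole data space" can look like, the sketch below fits a Gaussian RBF interpolator that maps training points to given low-dimensional coordinates and then evaluates it on unseen points. This is not the authors' algorithm: the function names, the kernel scale `sigma`, the ridge term `reg`, and the random data are all assumptions made for the example, and the training embedding `Y_train` is simply taken as given rather than learned from the proposed objective.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    # Pairwise Gaussian RBF values exp(-||a - b||^2 / sigma^2)
    sq_dists = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / sigma ** 2)

def fit_rbf_interpolator(X, Y, sigma=1.0, reg=1e-8):
    # Solve (K + reg * I) C = Y so that the interpolator reproduces the
    # training embedding Y at the training points X.
    # reg is a small ridge term for numerical stability (an assumption).
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + reg * np.eye(len(X)), Y)

def apply_rbf_interpolator(X_train, C, X_new, sigma=1.0):
    # Evaluate f(x) = sum_i C[i] * exp(-||x - x_i||^2 / sigma^2) at new points
    return gaussian_kernel(X_new, X_train, sigma) @ C

# Hypothetical data: 20 training points in R^5 with a 2-D embedding
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 5))
Y_train = rng.normal(size=(20, 2))

C = fit_rbf_interpolator(X_train, Y_train)
Y_fit = apply_rbf_interpolator(X_train, C, X_train)
Y_new = apply_rbf_interpolator(X_train, C, rng.normal(size=(3, 5)))
```

For a Gaussian RBF interpolator of this form, smaller coefficient magnitudes in `C` and a larger `sigma` give a smoother map, which is one informal way to read the role of Lipschitz regularity mentioned in the abstract.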


Pages: 2139-2143

Page count: 5
