Generalized Multi-View Embedding for Visual Recognition and Cross-Modal Retrieval

Cited by: 80
Authors
Cao, Guanqun
Iosifidis, Alexandros
Chen, Ke
Gabbouj, Moncef
Affiliations
[1] Laboratory of Signal Processing, Tampere University of Technology, Tampere
[2] Department of Engineering, Electrical and Computer Engineering, Aarhus University, Aarhus
Funding
Academy of Finland
Keywords
Cross-modal retrieval; multi-view discriminant embedding; multi-view subspace learning; visual recognition; CANONICAL CORRELATION-ANALYSIS; REPRESENTATION; SPACE;
DOI
10.1109/TCYB.2017.2742705
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, the problem of multi-view embedding from different visual cues and modalities is considered. We propose a unified solution for subspace learning methods using the Rayleigh quotient, which is extensible to multiple views, supervised learning, and nonlinear embeddings. Numerous methods including canonical correlation analysis, partial least squares regression, and linear discriminant analysis are studied using specific intrinsic and penalty graphs within the same framework. Nonlinear extensions based on kernels and (deep) neural networks are derived, achieving better performance than the linear ones. Moreover, a novel multi-view modular discriminant analysis is proposed by taking the view difference into consideration. We demonstrate the effectiveness of the proposed multi-view embedding methods on visual object recognition and cross-modal image retrieval, and obtain superior results in both applications compared to related methods.
Pages: 2542-2555
Page count: 14