Scalable multi-label canonical correlation analysis for cross-modal retrieval

Cited by: 17
Authors
Shu, Xin [1, 2]
Zhao, Guoying [2 ]
Affiliations
[1] Nanjing Agr Univ, Coll Artificial Intelligence, 1 Wei Gang, Nanjing, Peoples R China
[2] Univ Oulu, Ctr Machine Vis & Signal Anal, Oulu, Finland
Funding
Academy of Finland; National Natural Science Foundation of China;
Keywords
Canonical correlation analysis; Semantic transformation; Cross-modal retrieval; Singular value decomposition;
DOI
10.1016/j.patcog.2021.107905
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-label canonical correlation analysis (ml-CCA) has been developed for cross-modal retrieval. However, the computation of ml-CCA involves dense matrices eigendecomposition, which can be computationally expensive. In addition, ml-CCA only takes semantic correlation into account which ignores the cross-modal feature correlation. In this paper, we propose a novel framework to simultaneously integrate the semantic correlation and feature correlation for cross-modal retrieval. By using the semantic transformation, we show that our model can avoid computing the covariance matrix explicitly which is a huge save of computational cost. Further analysis shows that our proposed method can be solved via singular value decomposition which has linear time complexity. Experimental results on three multi-label datasets have demonstrated the accuracy and efficiency of our proposed method. © 2021 Elsevier Ltd. All rights reserved.
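As an illustration of the general idea in the abstract (solving a CCA-type problem with a singular value decomposition instead of an eigendecomposition of dense covariance matrices), the following is a minimal sketch of classical two-view CCA computed through thin QR factorizations and one SVD. It is not the paper's method: it uses no label information or semantic transformation, and the function name `cca_via_svd`, its parameters, and the assumption that both centered data matrices have full column rank with more samples than features are all chosen here for illustration only.

```python
import numpy as np


def cca_via_svd(X, Y, k):
    """Classical CCA (not ml-CCA) via thin QR + SVD.

    X: (n, dx) and Y: (n, dy) hold n paired samples, one per row.
    Assumes n > max(dx, dy) and full column rank after centering.
    Returns projection matrices Wx (dx, k), Wy (dy, k) and the
    top-k canonical correlations.
    """
    # Center each view so that inner products act like covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Thin QR: Qx and Qy have orthonormal columns spanning each view.
    # No covariance matrix is ever formed explicitly.
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)

    # Singular values of Qx^T Qy are the canonical correlations.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)

    # Map the singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Rx, U[:, :k])
    Wy = np.linalg.solve(Ry, Vt.T[:, :k])
    return Wx, Wy, s[:k]
```

For retrieval, one would project queries from one modality with `Wx` and database items from the other with `Wy`, then rank by a similarity measure in the shared space; that ranking step is left out of the sketch.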
Pages: 10
Related papers
32 in total
[1]  
Andrienko G., 2013, Introduction, P1
[2]  
[Anonymous], 2009, P ACM INT C IM VID R
[3]  
[Anonymous], 2013, P 27 AAAI C ART INT
[4]   Representation learning using step-based deep multi-modal autoencoders [J].
Bhatt, Gaurav ;
Jha, Piyush ;
Raman, Balasubramanian .
PATTERN RECOGNITION, 2019, 95 :12-23
[5]   Generalized Multi-View Embedding for Visual Recognition and Cross-Modal Retrieval [J].
Cao, Guanqun ;
Iosifidis, Alexandros ;
Chen, Ke ;
Gabbouj, Moncef .
IEEE TRANSACTIONS ON CYBERNETICS, 2018, 48 (09) :2542-2555
[6]   On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval [J].
Costa Pereira, Jose ;
Coviello, Emanuele ;
Doyle, Gabriel ;
Rasiwasia, Nikhil ;
Lanckriet, Gert R. G. ;
Levy, Roger ;
Vasconcelos, Nuno .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2014, 36 (03) :521-535
[7]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[8]  
Golub G. H., 2012, Matrix computations, V3
[9]   A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics [J].
Gong, Yunchao ;
Ke, Qifa ;
Isard, Michael ;
Lazebnik, Svetlana .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2014, 106 (02) :210-233
[10]   Relations between two sets of variates [J].
Hotelling, H .
BIOMETRIKA, 1936, 28 :321-377