Topic correlation model for cross-modal multimedia information retrieval

Cited by: 0
Authors
Zengchang Qin
Jing Yu
Yonghui Cong
Tao Wan
Affiliations
[1] Beihang University, Intelligent Computing and Machine Learning Lab, School of ASEE
[2] Chinese Academy of Sciences, Institute of Information Engineering
[3] Beihang University, School of Biological Science and Medical Engineering
Source
Pattern Analysis and Applications | 2016 / Vol. 19
Keywords
Cross-modal multimedia retrieval; Topic correlation model; Topic models; Bag-of-features model
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification code
Abstract
In this paper, we present a simple and effective topic correlation model (TCM) for cross-modal multimedia retrieval that jointly models the text and image components of multimedia documents. The image component is represented by a bag-of-features model built on local scale-invariant feature transform (SIFT) features, while the text component is described by a topic distribution learned from a latent topic model. Statistical correlations between these two mid-level features are investigated by mapping them into a semantic space. These cross-modality correlations are then used to calculate the conditional probabilities of answers in one modality given a query in the other. The model is tested on three cross-modal retrieval benchmark problems, including Wikipedia documents in both English and Chinese. Experimental results demonstrate that TCM achieves the best performance among the recent state-of-the-art cross-modal retrieval models compared on these benchmarks.
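The retrieval idea in the abstract (map image and text features into a shared semantic space, then score candidates in one modality given a query in the other) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not give the actual TCM equations, so the semantic mapping is a stand-in (softmax over distances to class centroids), the score is a simple marginalization over shared classes, and the toy histograms, topic vectors, labels, and function names are all hypothetical.

```python
import numpy as np

def normalize_rows(m):
    """Scale each row to sum to 1 so it can be read as a distribution."""
    m = np.asarray(m, dtype=float)
    return m / m.sum(axis=1, keepdims=True)

def class_posteriors(features, labels, n_classes):
    """Map features to a semantic space of class posteriors.

    Illustrative stand-in (not from the paper): softmax over negative
    squared distances to per-class centroids.
    """
    centroids = np.stack(
        [features[labels == c].mean(axis=0) for c in range(n_classes)]
    )
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def rank_texts_for_image(image_sem_row, text_sem):
    """Score each text by sum_c P(c | text) * P(c | image), best first."""
    scores = text_sem @ image_sem_row
    return np.argsort(-scores), scores

# Toy data (hypothetical): 4 documents, 2 semantic classes.
labels = np.array([0, 0, 1, 1])
# Bag-of-features histograms over 3 visual words (e.g. quantized SIFT).
image_bof = normalize_rows([[8, 1, 1], [7, 2, 1], [1, 1, 8], [2, 1, 7]])
# Topic distributions over 2 topics, as a latent topic model might produce.
text_topics = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])

image_sem = class_posteriors(image_bof, labels, n_classes=2)
text_sem = class_posteriors(text_topics, labels, n_classes=2)

# Image-to-text retrieval: use image 0 as the query.
order, scores = rank_texts_for_image(image_sem[0], text_sem)
print("ranking of texts for image 0:", order)
```

In this toy setup the texts that share the query image's class should rank first. A real pipeline would replace the toy arrays with quantized SIFT histograms and topic distributions inferred by a topic model, and would use the paper's own correlation and probability definitions rather than this stand-in.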
Pages: 1007-1022
Number of pages: 15