Learning semantic correlations for cross-media retrieval

Cited by: 22
Authors
Wu, Fei [1]
Zhang, Hong [1]
Zhuang, Yueting [1]
Affiliation
[1] Zhejiang Univ, Coll Comp Sci & Engn, Hangzhou 310027, Peoples R China
Source
2006 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP 2006, PROCEEDINGS | 2006
Funding
National Natural Science Foundation of China;
Keywords
cross-media retrieval; canonical correlation; relevance feedback;
DOI
10.1109/ICIP.2006.312707
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a novel cross-media retrieval approach. First, an isomorphic subspace is constructed using Canonical Correlation Analysis (CCA) to learn multi-modal correlations among media objects. Second, polar coordinates are used to measure the general distance between media objects of different modalities in the subspace. Because the complete semantic correlations are unlikely to be learned from limited training samples, users' relevance feedback is used to refine cross-media similarities accurately. We also propose methods for mapping new media objects into the learned subspace, so that any new media object can be used as a query example. Experimental results show that our approaches are effective for cross-media retrieval and achieve a significant improvement over content-based image retrieval and content-based audio retrieval.
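The first two steps of the abstract (a CCA subspace shared by two modalities, then a distance between objects of different modalities in it) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the feature matrices are synthetic, the names `fit_cca`, `project`, and `cosine_distance` are hypothetical, and an angle-based cosine distance stands in for the paper's polar-coordinate measure, whose exact form the abstract does not give. Relevance feedback is not modelled.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-6):
    """Linear CCA via whitening + SVD. Returns means, projections, correlations."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularised covariance and cross-covariance matrices.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # T = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return mx, my, Wx, Wy, s[:k]

def project(Z, mean, W):
    """Map feature vectors (rows of Z) into the learned subspace."""
    return (Z - mean) @ W

def cosine_distance(a, B):
    """Angle-based distance from vector a to each row of B (assumed measure)."""
    a = a / (np.linalg.norm(a) + 1e-12)
    B = B / (np.linalg.norm(B, axis=1, keepdims=True) + 1e-12)
    return 1.0 - B @ a

# Synthetic stand-ins for image and audio features sharing a latent "semantic" factor.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 3))
images = latent @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(n, 10))
audio = latent @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(n, 8))

mx, my, Wx, Wy, corr = fit_cca(images, audio, k=3)
img_sub = project(images, mx, Wx)
aud_sub = project(audio, my, Wy)

# Cross-media query: rank all audio objects by distance to the first image.
ranking = np.argsort(cosine_distance(img_sub[0], aud_sub))
```

A new media object of either modality would be mapped with the same `project` call before ranking, which is one simple reading of "mapping new media objects into the learned subspace".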
Pages: 1465 / +
Page count: 2