Cross-media retrieval: Concepts, advances and challenges

Times cited: 0
Authors
Zhuang, Yueting [1]
Wu, Fei [1]
Zhang, Hong [1]
Yang, Yi [1]
Affiliation
[1] Zhejiang University, Institute of Artificial Intelligence, Hangzhou 310027, People's Republic of China
Source
PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE: 50 YEARS' ACHIEVEMENTS, FUTURE DIRECTIONS AND SOCIAL IMPACTS | 2006
Keywords
cross-media retrieval; canonical correlation; manifold learning
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cross-media retrieval is an emerging, comprehensive research topic that seeks to provide a more effective retrieval approach, so that internet users can query multimedia objects using examples of a different media type. For example, in a cross-media retrieval system users can query images by submitting an example audio clip, and vice versa. Such a system fits human habits better and is thus more powerful in retrieval performance. To achieve cross-media retrieval, we need to resolve the problem of semantic understanding and of mappings among heterogeneous low-level multi-modal feature spaces, such as judging the correlation between visual and auditory content in accordance with human perception. In this paper we present the concept of cross-media retrieval and two approaches for its two kinds of task: Correlation Isomorphic Space Learning (CISL) for media object retrieval and Manifold Semantic Space Learning (MSSL) for multimedia document retrieval. CISL uses canonical correlation analysis to map pairs of heterogeneous multi-modal features into an integrated semantic subspace where canonical correlations are maximally preserved. MSSL applies manifold learning to explore the relationships among multimedia documents and among the media objects within them. Experimental results are encouraging and indicate that the proposed approaches are effective.
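The abstract's description of mapping paired heterogeneous features into a shared subspace with canonical correlation analysis can be illustrated with a minimal sketch. This is plain CCA with a small ridge term, not the authors' CISL method; the function name `fit_cca`, the `reg` parameter, the synthetic "image" and "audio" feature matrices, and the cosine-similarity ranking step are all assumptions made for illustration.

```python
import numpy as np

def fit_cca(X, Y, n_components=2, reg=1e-6):
    """Plain CCA sketch: returns projections Wx, Wy and canonical correlations.

    X, Y are paired feature matrices (same number of rows). `reg` is a
    hypothetical ridge term added only for numerical stability.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten both views with Cholesky factors: M = Lx^{-1} Cxy Ly^{-T}
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    # Singular values of M are the canonical correlations
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return Wx, Wy, s[:n_components]

# Synthetic paired data (assumed): two "media types" driven by a shared latent z
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 2))
X = z @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(n, 6))  # e.g. image features
Y = z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(n, 4))  # e.g. audio features

Wx, Wy, corrs = fit_cca(X, Y, n_components=2)

# Cross-media query: project one X item and rank all Y items by cosine similarity
query = (X[0] - X.mean(axis=0)) @ Wx
gallery = (Y - Y.mean(axis=0)) @ Wy
sims = gallery @ query / (np.linalg.norm(gallery, axis=1) * np.linalg.norm(query) + 1e-12)
ranking = np.argsort(-sims)
```

Here `ranking` orders the items of the second media type by their similarity to the query from the first, which is one simple way to use a shared subspace for retrieval; the paper itself may rank results differently.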
Pages: 377-380
Page count: 4