A Multimedia Retrieval Framework Based on Semi-Supervised Ranking and Relevance Feedback

Citations: 328
Authors
Yang, Yi [1, 2]
Nie, Feiping [3]
Xu, Dong [4]
Luo, Jiebo [5]
Zhuang, Yueting [1]
Pan, Yunhe [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou 310027, Zhejiang, Peoples R China
[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
[3] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[4] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
[5] Eastman Kodak Co, Kodak Res Labs, Rochester, NY 14650 USA
Funding
National Natural Science Foundation of China; US National Science Foundation; National Research Foundation of Singapore;
Keywords
Content-based multimedia retrieval; semi-supervised learning; ranking algorithm; relevance feedback; cross-media retrieval; image retrieval; 3D motion data retrieval;
DOI
10.1109/TPAMI.2011.170
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a new framework for multimedia content analysis and retrieval that consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking scores of its neighboring points. A unified objective function is then proposed to globally align the local models from all the data points so that an optimal ranking score can be assigned to each data point. Second, we propose a semi-supervised long-term Relevance Feedback (RF) algorithm to refine the multimedia data representation. The proposed long-term RF algorithm utilizes both the multimedia data distribution in the multimedia feature space and the historical RF information provided by users. A trace ratio optimization problem is then formulated and solved by an efficient algorithm. The algorithms have been applied to several content-based multimedia retrieval applications, including cross-media retrieval, image retrieval, and 3D motion/pose data retrieval. Comprehensive experiments on four data sets have demonstrated their advantages in precision, robustness, scalability, and computational efficiency.
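The local-regression and global-alignment idea described in the abstract can be illustrated with a short sketch. This is not the authors' implementation. It assumes ridge-regularized local linear models, a k-nearest-neighbor neighborhood, and a ranking step that fits the scores to a one-hot query vector under a diagonal weight matrix. The function names `lrga_laplacian` and `rank_scores` and the parameters `k`, `lam`, and `query_weight` are hypothetical choices made for this example.

```python
import numpy as np


def lrga_laplacian(X, k=5, lam=1.0):
    """Sum local ridge-regression residual operators into one Laplacian.

    X is an (n, d) feature matrix. k (neighborhood size) and lam (ridge
    strength) are illustrative defaults, not values from the paper.
    """
    n, d = X.shape
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    H = np.eye(k) - np.ones((k, k)) / k               # centering matrix
    L = np.zeros((n, n))
    for i in range(n):
        # k nearest points to x_i (normally this includes x_i itself).
        idx = np.argsort(dist[i])[:k]
        Xc = H @ X[idx]                               # centered neighbors
        G = Xc.T @ Xc + lam * np.eye(d)
        # f_i^T Li f_i is the smallest ridge-regression error achievable
        # when a local linear model predicts the neighbors' scores f_i.
        Li = H - Xc @ np.linalg.solve(G, Xc.T)
        # Global alignment: accumulate each local operator into L.
        L[np.ix_(idx, idx)] += Li
    return L


def rank_scores(L, query_idx, query_weight=1e4):
    """Minimize f^T L f + (f - y)^T U (f - y) for a single query point.

    Assumption: y is 1 at the query and 0 elsewhere, and U is diagonal
    with a large weight on the query and 1 on every other point.
    """
    n = L.shape[0]
    y = np.zeros(n)
    y[query_idx] = 1.0
    u = np.ones(n)
    u[query_idx] = query_weight
    # Setting the gradient to zero gives (L + U) f = U y.
    return np.linalg.solve(L + np.diag(u), u * y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, size=(10, 3)),
                   rng.normal(100.0, 1.0, size=(10, 3))])
    scores = rank_scores(lrga_laplacian(X), query_idx=0)
    print(np.argsort(-scores)[:5])  # indices of the five top-ranked points
```

In this sketch each local operator is symmetric and positive semidefinite with zero row sums, so the accumulated matrix behaves like a graph Laplacian, and points whose neighborhoods connect to the query receive higher scores than points in unrelated regions of the feature space.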
Pages: 723-742
Page count: 20