Semi-Supervised Heterogeneous Fusion for Multimedia Data Co-Clustering

Cited by: 52
Authors
Meng, Lei [1]
Tan, Ah-Hwee [1]
Xu, Dong [1]
Affiliation
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Funding
National Research Foundation, Singapore
Keywords
Semi-supervised learning; heterogeneous data co-clustering; multimedia data mining
DOI
10.1109/TKDE.2013.47
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Co-clustering is a commonly used technique for tapping the rich meta-information of multimedia web documents, including category, annotation, and description, for associative discovery. However, most co-clustering methods proposed for heterogeneous data do not consider the representation problem of short and noisy text and their performance is limited by the empirical weighting of the multi-modal features. In this paper, we propose a generalized form of Heterogeneous Fusion Adaptive Resonance Theory, called GHF-ART, for co-clustering of large-scale web multimedia documents. By extending the two-channel Heterogeneous Fusion ART (HF-ART) to multiple channels, GHF-ART is designed to handle multimedia data with an arbitrarily rich level of meta-information. For handling short and noisy text, GHF-ART does not learn directly from the textual features. Instead, it identifies key tags by learning the probabilistic distribution of tag occurrences. More importantly, GHF-ART incorporates an adaptive method for effective fusion of multi-modal features, which weights the features of multiple data sources by incrementally measuring the importance of feature modalities through the intra-cluster scatters. Extensive experiments on two web image data sets and one text document set have shown that GHF-ART achieves significantly better clustering performance and is much faster than many existing state-of-the-art algorithms.
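The adaptive fusion idea in the abstract (weighting each feature modality by how tightly it groups within clusters) can be illustrated with a small sketch. This is not the GHF-ART update rule, which the abstract does not spell out; it is a batch approximation under stated assumptions: scatter is taken as the mean L1 distance of cluster members to their centroid, and a modality's weight is proportional to 1 / (1 + scatter). The function name `modality_weights` and the toy data are hypothetical.

```python
import numpy as np


def modality_weights(features, labels):
    """Weight each modality by the inverse of its intra-cluster scatter.

    features: dict mapping modality name -> array of shape (n_samples, dim)
    labels:   cluster id for each sample
    Assumption (not from the paper): scatter is the mean L1 distance to the
    cluster centroid, and weight is proportional to 1 / (1 + scatter).
    """
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    inverse_scatter = {}
    for name, X in features.items():
        X = np.asarray(X, dtype=float)
        per_cluster = []
        for c in clusters:
            members = X[labels == c]
            centroid = members.mean(axis=0)
            # Mean L1 distance of the cluster's members to its centroid.
            per_cluster.append(np.abs(members - centroid).sum(axis=1).mean())
        scatter = float(np.mean(per_cluster))
        # Lower scatter means the modality is more consistent within clusters.
        inverse_scatter[name] = 1.0 / (1.0 + scatter)
    total = sum(inverse_scatter.values())
    return {name: value / total for name, value in inverse_scatter.items()}


# Toy data: visual features are tight within clusters, tag features are noisy.
labels = [0, 0, 1, 1]
features = {
    "visual": [[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [1.0, 0.9]],
    "tags": [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0], [1.0, 1.0]],
}
weights = modality_weights(features, labels)
print(weights)
```

In this toy case the visual modality receives the larger weight because its clusters are more compact. An incremental variant, as the abstract describes, would update the scatter estimates as each document is assigned rather than recomputing them over the whole data set.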
Pages: 2293-2306
Page count: 14