Locally Discriminative Coclustering

Citations: 26
Authors
Zhang, Lijun [1 ]
Chen, Chun [1 ]
Bu, Jiajun [1 ]
Chen, Zhengguang [1 ]
Cai, Deng [2 ]
Han, Jiawei [3 ]
Affiliations
[1] Zhejiang Univ, Zhejiang Prov Key Lab Serv Robot, Coll Comp Sci, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, State Key Lab CAD&CG, Coll Comp Sci, Hangzhou 310027, Peoples R China
[3] Univ Illinois, Dept Comp Sci, Siebel Ctr Comp Sci, Urbana, IL 61801 USA
Keywords
Coclustering; clustering; bipartite graph; local linear regression; PREDICTION; DISCOVERY;
DOI
10.1109/TKDE.2011.71
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unlike traditional one-sided clustering techniques, coclustering makes use of the duality between samples and features to partition them simultaneously. Most existing coclustering algorithms focus on modeling the relationship between samples and features, whereas the intersample and interfeature relationships are ignored. In this paper, we propose a novel coclustering algorithm named Locally Discriminative Coclustering (LDCC) to explore the relationship between samples and features as well as the intersample and interfeature relationships. Specifically, the sample-feature relationship is modeled by a bipartite graph between samples and features. We apply local linear regression to discover the intrinsic discriminative structures of both the sample space and the feature space. For each local patch in the sample and feature spaces, a local linear function is estimated to predict the labels of the points in this patch. The intersample and interfeature relationships are thus captured by minimizing the fitting errors of all the local linear functions. In this way, LDCC groups strongly associated samples and features together, while respecting the local structures of both sample and feature spaces. Our experimental results on several benchmark data sets demonstrate the effectiveness of the proposed method.
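The local linear regression step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `local_fitting_error`, the Euclidean nearest-neighbor definition of a patch, the ridge regularizer `lam`, and the use of a single real-valued label per point are all assumptions made for illustration. For each patch, the sketch fits a regularized linear function to the labels and accumulates the fitting error.

```python
import numpy as np

def local_fitting_error(X, y, k=5, lam=1e-3):
    """Sum of ridge-regularized local linear fitting errors over all patches.

    X : (n, d) array of points; y : (n,) array of real-valued labels.
    Assumption: the patch of point i is the k + 1 points closest to i
    in Euclidean distance (normally i itself plus k neighbors).
    """
    n, d = X.shape
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    total = 0.0
    for i in range(n):
        idx = np.argsort(dist[i])[: k + 1]
        # Centering the patch absorbs the intercept of the linear function.
        Xi = X[idx] - X[idx].mean(axis=0)
        yi = y[idx] - y[idx].mean()
        # Closed-form ridge solution for the local linear function.
        A = Xi.T @ Xi + lam * np.eye(d)
        w = np.linalg.solve(A, Xi.T @ yi)
        resid = yi - Xi @ w
        total += resid @ resid + lam * (w @ w)
    return total

# Hypothetical usage: labels that vary linearly over the data fit every
# patch almost perfectly, while unrelated random labels do not.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y_linear = X @ np.array([1.0, -2.0]) + 0.5
y_random = rng.normal(size=30)
err_linear = local_fitting_error(X, y_linear, lam=1e-8)
err_random = local_fitting_error(X, y_random, lam=1e-8)
```

A label assignment that is locally well explained by linear functions yields a small total error. This is the sense in which minimizing the fitting errors respects local structure. The abstract applies the same idea to both the sample space and the feature space.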
Pages: 1025-1035
Number of pages: 11