Flexible and Robust Multi-Network Clustering

被引:38
作者
Ni, Jingchao [1 ]
Tong, Hanghang [2 ]
Fan, Wei [3 ]
Zhang, Xiang [1 ]
机构
[1] Case Western Reserve Univ, Dept Elect Engn & Comp Sci, Cleveland, OH 44106 USA
[2] Arizona State Univ, Decis Syst Engn, Informat, Sch Comp, Tempe, AZ 85287 USA
[3] Baidu Res Big Data Lab, Sunnyvale, CA USA
来源
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2015年
基金
美国国家卫生研究院; 美国国家科学基金会;
关键词
Network of Networks; Graph clustering;
D O I
10.1145/2783258.2783262
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Integrating multiple graphs (or networks) has been shown to be a promising approach to improve the graph clustering accuracy. Various multi-view and multi-domain graph clustering methods have recently been developed to integrate multiple networks. In these methods, a network is treated as a view or domain. The key assumption is that there is a common clustering structure shared across all views (domains), and different views (domains) provide compatible and complementary information on this underlying clustering structure. However, in many emerging real-life applications, different networks have different data distributions, where the assumption that all networks share a single common clustering structure does not hold. In this paper, we propose a flexible and robust framework that allows multiple underlying clustering structures across different networks. Our method models the domain similarity as a network, which can be utilized to regularize the clustering structures in different networks. We refer to such a data model as a network of networks (NoN). We develop NoNCLus, a novel method based on non-negative matrix factorization (NMF), to cluster an NoN. We provide rigorous theoretical analysis of NoNCLus in terms of its correctness, convergence and complexity. Extensive experimental results on synthetic and real-life datasets show the effectiveness of our method.
引用
收藏
页码:835 / 844
页数:10
相关论文
共 37 条
[1]  
Akata Z., 2011, P 16 COMP VIS WINT W, P1
[2]  
[Anonymous], 1993, Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment
[3]  
[Anonymous], 2000, NIPS
[4]  
[Anonymous], KDD
[5]  
[Anonymous], 2007, Proceedings of the International Conference on Machine Learning, DOI DOI 10.1145/1273496.1273642
[6]  
[Anonymous], 2014, KDD
[7]  
[Anonymous], 2003, Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining
[8]  
[Anonymous], 2006, KDD
[9]   Gene Ontology: tool for the unification of biology [J].
Ashburner, M ;
Ball, CA ;
Blake, JA ;
Botstein, D ;
Butler, H ;
Cherry, JM ;
Davis, AP ;
Dolinski, K ;
Dwight, SS ;
Eppig, JT ;
Harris, MA ;
Hill, DP ;
Issel-Tarver, L ;
Kasarskis, A ;
Lewis, S ;
Matese, JC ;
Richardson, JE ;
Ringwald, M ;
Rubin, GM ;
Sherlock, G .
NATURE GENETICS, 2000, 25 (01) :25-29
[10]   An ensemble framework for clustering protein-protein interaction networks [J].
Asur, Sitaram ;
Ucar, Duygu ;
Parthasarathy, Srinivasan .
BIOINFORMATICS, 2007, 23 (13) :I29-I40