Efficient eigen-updating for spectral graph clustering

被引:30
作者
Dhanjal, Charanpal [1 ]
Gaudel, Romaric [2 ]
Clemencon, Stephan [3 ]
机构
[1] UPMC, LIP6, F-75252 Paris 05, France
[2] Univ Lille 3, F-59653 Villeneuve Dascq, France
[3] Telecom ParisTech, F-75634 Paris 13, France
关键词
Spectral graph clustering; Eigen-decomposition; Unsupervised learning; Normalised Laplacian; COMPONENT ANALYSIS; ALGORITHMS; MATRIX;
D O I
10.1016/j.neucom.2013.11.015
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Partitioning a graph into groups of vertices such that those within each group are more densely connected than vertices assigned to different groups, known as graph clustering, is often used to gain insight into the organisation of large scale networks and for visualisation purposes. Whereas a large number of dedicated techniques have been recently proposed for static graphs, the design of on-line graph clustering methods tailored for evolving networks is a challenging problem, and much less documented in the literature. Motivated by the broad variety of applications concerned, ranging from the study of biological networks to the analysis of networks of scientific references through the exploration of communications networks such as the World Wide Web, it is the main purpose of this paper to introduce a novel, computationally efficient, approach to graph clustering in the evolutionary context. Namely, the method promoted in this article can be viewed as an incremental eigenvalue solution for the spectral clustering method described by Ng et al. (2001) [25]. The incremental eigenvalue solution is a general technique for finding the approximate eigenvectors of a symmetric matrix given a change. As well as outlining the approach in detail, we present a theoretical bound on the quality of the approximate eigenvectors using perturbation theory. We then derive a novel spectral clustering algorithm called Incremental Approximate Spectral Clustering (IASC). The IASC algorithm is simple to implement and its efficacy is demonstrated on both synthetic and real datasets modelling the evolution of a HIV epidemic, a citation network and the purchase history graph of an e-commerce website. (C) 2014 Elsevier B.V. All rights reserved.
引用
收藏
页码:440 / 452
页数:13
相关论文
共 43 条
  • [11] Flake G. W., 2004, Internet Mathematics, V1, P385, DOI DOI 10.1080/15427951.2004.10129093
  • [12] Community detection in graphs
    Fortunato, Santo
    [J]. PHYSICS REPORTS-REVIEW SECTION OF PHYSICS LETTERS, 2010, 486 (3-5): : 75 - 174
  • [13] Spectral grouping using the Nystrom method
    Fowlkes, C
    Belongie, S
    Chung, F
    Malik, J
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2004, 26 (02) : 214 - 225
  • [14] Enumeration of cospectral graphs
    Haemers, WH
    Spence, E
    [J]. EUROPEAN JOURNAL OF COMBINATORICS, 2004, 25 (02) : 199 - 211
  • [15] Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
    Halko, N.
    Martinsson, P. G.
    Tropp, J. A.
    [J]. SIAM REVIEW, 2011, 53 (02) : 217 - 288
  • [16] Analysis of a complex of statistical variables into principal components
    Hotelling, H
    [J]. JOURNAL OF EDUCATIONAL PSYCHOLOGY, 1933, 24 : 417 - 441
  • [17] Kumar S, 2009, Proc. 12th Int. Conf. Artificial Intelligence and Statistics, P304
  • [18] KWOK JT, 2003, P ICANN IST TURK JUN, P270
  • [19] Lehoucq R., 1998, ARPACK Users' Guide: Solution of LargeScale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods
  • [20] Mahdavi M., 2012, ARXIV12090001