Efficient eigen-updating for spectral graph clustering

Cited by: 30
Authors
Dhanjal, Charanpal [1]
Gaudel, Romaric [2]
Clemencon, Stephan [3]
Affiliations
[1] UPMC, LIP6, F-75252 Paris 05, France
[2] Univ Lille 3, F-59653 Villeneuve d'Ascq, France
[3] Telecom ParisTech, F-75634 Paris 13, France
Keywords
Spectral graph clustering; Eigen-decomposition; Unsupervised learning; Normalised Laplacian; COMPONENT ANALYSIS; ALGORITHMS; MATRIX
DOI
10.1016/j.neucom.2013.11.015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Partitioning a graph into groups of vertices such that those within each group are more densely connected than vertices assigned to different groups, known as graph clustering, is often used to gain insight into the organisation of large scale networks and for visualisation purposes. Whereas a large number of dedicated techniques have been recently proposed for static graphs, the design of on-line graph clustering methods tailored for evolving networks is a challenging problem, and much less documented in the literature. Motivated by the broad variety of applications concerned, ranging from the study of biological networks to the analysis of networks of scientific references through the exploration of communications networks such as the World Wide Web, it is the main purpose of this paper to introduce a novel, computationally efficient, approach to graph clustering in the evolutionary context. Namely, the method promoted in this article can be viewed as an incremental eigenvalue solution for the spectral clustering method described by Ng et al. (2001) [25]. The incremental eigenvalue solution is a general technique for finding the approximate eigenvectors of a symmetric matrix given a change. As well as outlining the approach in detail, we present a theoretical bound on the quality of the approximate eigenvectors using perturbation theory. We then derive a novel spectral clustering algorithm called Incremental Approximate Spectral Clustering (IASC). The IASC algorithm is simple to implement and its efficacy is demonstrated on both synthetic and real datasets modelling the evolution of an HIV epidemic, a citation network and the purchase history graph of an e-commerce website. (C) 2014 Elsevier B.V. All rights reserved.
Pages: 440-452
Number of pages: 13
Related papers
43 in total
  • [1] [Anonymous], 1990, MATRIX PERTURBATION
  • [2] [Anonymous], 2003, SIGKDD Explor., DOI 10.1145/980972.980992
  • [3] [Anonymous], BMC INFECT DIS
  • [4] Using linear algebra for intelligent information retrieval
    Berry, MW
    Dumais, ST
    O'Brien, GW
    [J]. SIAM REVIEW, 1995, 37 (04) : 573 - 595
  • [5] Chung F.R.K., 1997, Spectral graph theory
  • [6] ROTATION OF EIGENVECTORS BY A PERTURBATION .3.
    DAVIS, C
    KAHAN, WM
    [J]. SIAM JOURNAL ON NUMERICAL ANALYSIS, 1970, 7 (01) : 1 - &
  • [7] Dhanjal Charanpal, 2011, 3 NIPS WORKSH DISCR
  • [8] Drineas P, 2005, J MACH LEARN RES, V6, P2153
  • [9] Fast Monte Carlo algorithms for matrices II: Computing a low-rank approximation to a matrix
    Drineas, Petros
    Kannan, Ravi
    Mahoney, Michael W.
    [J]. SIAM JOURNAL ON COMPUTING, 2006, 36 (01) : 158 - 183
  • [10] Erdos P., 1959, PUBL MATH-DEBRECEN, V6, P290, DOI 10.5486/PMD.1959.6.3-4.12