OPTIMALITY OF SPECTRAL CLUSTERING IN THE GAUSSIAN MIXTURE MODEL

Times cited: 28
Authors
Loeffler, Matthias [1]
Zhang, Anderson Y. [2]
Zhou, Harrison H. [3]
Affiliations
[1] Swiss Fed Inst Technol, Seminar Stat, Zurich, Switzerland
[2] Univ Penn, Dept Stat, Philadelphia, PA 19104 USA
[3] Yale Univ, Dept Stat & Data Sci, New Haven, CT 06520 USA
Funding
Engineering and Physical Sciences Research Council (UK);
Keywords
Spectral clustering; K-means; Gaussian mixture model; Spectral perturbation; COMMUNITY DETECTION; ALGORITHM; CONSISTENCY; ASYMPTOTICS; MATRICES; NUMBER; GRAPHS; FORMS;
DOI
10.1214/20-AOS2044
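Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract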
Spectral clustering is one of the most popular algorithms for grouping high-dimensional data. It is easy to implement and computationally efficient. Despite its popularity and successful applications, its theoretical properties are not yet fully understood. In this paper, we show that spectral clustering is minimax optimal in the Gaussian mixture model with isotropic covariance matrix, when the number of clusters is fixed and the signal-to-noise ratio is large enough. Spectral gap conditions are widely assumed in the literature to analyze spectral clustering. In contrast, these conditions are not needed to establish the optimality of spectral clustering in this paper.
Pages: 2506-2530
Page count: 25