Certifying Global Optimality of Graph Cuts via Semidefinite Relaxation: A Performance Guarantee for Spectral Clustering

Cited: 0
Authors
Shuyang Ling
Thomas Strohmer
Affiliations
[1] New York University, Courant Institute of Mathematical Sciences
[2] University of California Davis, Department of Mathematics
Source
Foundations of Computational Mathematics | 2020, Vol. 20
Keywords
Semidefinite programming; Graph partition; Unsupervised learning; Spectral clustering; Community detection; Graph Laplacian; 90C34; 90C27; 90C46; 60B20;
DOI
Not available
Abstract
Spectral clustering has become one of the most widely used clustering techniques when the structure of the individual clusters is non-convex or highly anisotropic. Yet, despite its immense popularity, there is fairly little theory about performance guarantees for spectral clustering. This issue is partly due to the fact that spectral clustering typically involves two steps, which complicates its theoretical analysis: first, the eigenvectors of the associated graph Laplacian are used to embed the dataset, and second, the k-means clustering algorithm is applied to the embedded dataset to obtain the labels. This paper is devoted to the theoretical foundations of spectral clustering and graph cuts. We consider a convex relaxation of graph cuts, namely ratio cuts and normalized cuts, that makes the usual two-step approach of spectral clustering obsolete and at the same time gives rise to a rigorous theoretical analysis of graph cuts and spectral clustering. We derive deterministic bounds for successful spectral clustering via a spectral proximity condition that naturally depends on the algebraic connectivity of each cluster and the inter-cluster connectivity. Moreover, we demonstrate by means of some popular examples that our bounds can achieve near-optimality. Our findings are also fundamental to the theoretical understanding of kernel k-means. Numerical simulations confirm and complement our analysis.
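To make the two-step pipeline mentioned in the abstract concrete, the following is a minimal sketch of classical spectral clustering on a toy graph: embed via an eigenvector of the graph Laplacian, then partition the embedding. It illustrates only the standard two-step approach that the paper's convex relaxation replaces, not the authors' SDP method; the toy graph (two 4-node cliques joined by one edge) and the sign-thresholding shortcut for k = 2 are assumptions made for brevity.

```python
import numpy as np

# Toy graph: two 4-node cliques joined by a single weak edge.
n = 8
A = np.zeros((n, n))
A[:4, :4] = 1.0          # clique on nodes 0..3
A[4:, 4:] = 1.0          # clique on nodes 4..7
np.fill_diagonal(A, 0.0)
A[3, 4] = A[4, 3] = 1.0  # inter-cluster link

# Unnormalized graph Laplacian L = D - A (the ratio-cut relaxation uses L;
# the normalized cut uses D^{-1/2} L D^{-1/2} instead).
deg = A.sum(axis=1)
L = np.diag(deg) - A

# Step 1: spectral embedding. For k = 2 the embedding is the eigenvector
# of the second-smallest eigenvalue of L (the Fiedler vector); its
# eigenvalue is the algebraic connectivity mentioned in the abstract.
vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]

# Step 2: cluster the embedded points. With a 1-D embedding and k = 2,
# thresholding at zero plays the role of running k-means.
labels = (fiedler > 0).astype(int)
print(labels)
```

On this graph the sign pattern of the Fiedler vector recovers the two cliques exactly; the paper's point is that on harder instances the relaxation admits guarantees that this heuristic pipeline lacks.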
Pages: 367–421
Page count: 54