Fast Constrained Spectral Clustering and Cluster Ensemble with Random Projection

被引:9
作者
Liu, Wenfen [1 ,2 ,3 ]
Ye, Mao [4 ]
Wei, Jianghong [3 ]
Hu, Xuexian [3 ]
机构
[1] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guangxi Key Lab Cryptogp & Informat Secur, Guilin 541004, Peoples R China
[2] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[3] State Key Lab Math Engn & Adv Comp, Zhengzhou 450002, Henan, Peoples R China
[4] Natl Univ Def Technol, Nanjing 210012, Jiangsu, Peoples R China
关键词
DIMENSIONALITY REDUCTION;
D O I
10.1155/2017/2658707
中图分类号
Q [生物科学];
学科分类号
07 ; 0710 ; 09 ;
摘要
Constrained spectral clustering (CSC) method can greatly improve the clustering accuracy with the incorporation of constraint information into spectral clustering and thus has been paid academic attention widely. In this paper, we propose a fast CSC algorithm via encoding landmark-based graph construction into a new CSC model and applying random sampling to decrease the data size after spectral embedding. Compared with the original model, the new algorithm has the similar results with the increase of its model size asymptotically; compared with the most efficient CSC algorithm known, the new algorithm runs faster and has a wider range of suitable data sets. Meanwhile, a scalable semisupervised cluster ensemble algorithm is also proposed via the combination of our fast CSC algorithm and dimensionality reduction with random projection in the process of spectral ensemble clustering. We demonstrate by presenting theoretical analysis and empirical results that the new cluster ensemble algorithm has advantages in terms of efficiency and effectiveness. Furthermore, the approximate preservation of random projection in clustering accuracy proved in the stage of consensus clustering is also suitable for the weighted kappa-means clustering and thus gives the theoretical guarantee to this special kind of kappa-means clustering where each point has its corresponding weight.
引用
收藏
页数:14
相关论文
共 45 条
[1]   Database-friendly random projections: Johnson-Lindenstrauss with binary coins [J].
Achlioptas, D .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2003, 66 (04) :671-687
[2]  
Aggarwal C. C., 2014, DATA CLASSIFICATION
[3]  
[Anonymous], 2007, SOC IND APPL MATH
[4]  
[Anonymous], 2015, P 21 ACM SIGKDD INT
[5]  
[Anonymous], 2003, Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence
[6]  
[Anonymous], 2015, E-Health and Bioengineering Conference (EHB)
[7]  
[Anonymous], SOFT COMPUT
[8]  
[Anonymous], PERVASIVE MOBILE COM
[9]  
Boutsidis C., 2010, ADV NEURAL INFORM PR, V23, P298
[10]   Randomized Dimensionality Reduction for k-Means Clustering [J].
Boutsidis, Christos ;
Zouzias, Anastasios ;
Mahoney, Michael W. ;
Drineas, Petros .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2015, 61 (02) :1045-1062