Fast Multiway Maximum Margin Clustering Based on Genetic Algorithm via the Nyström Method

Times cited: 0
Authors
Kang, Ying [1 ,2 ]
Zhang, Dong [3 ,4 ]
Yu, Bo [1 ]
Gu, Xiaoyan [1 ]
Wang, Weiping [1 ]
Meng, Dan [1 ]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] State Key Lab High End Server & Storage Technol, Jinan, Peoples R China
[4] Inspur Grp Corp Ltd, Jinan, Peoples R China
Source
WEB-AGE INFORMATION MANAGEMENT (WAIM 2015) | 2015 / Vol. 9098
Keywords
Maximum margin clustering; Nyström method; Genetic algorithm
DOI
10.1007/978-3-319-21042-1_33
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Motivated by support vector machine theory, the concept of maximum margin has been extended to the unsupervised setting, yielding a clustering method known as maximum margin clustering (MMC). MMC achieves higher clustering accuracy than traditional clustering methods. However, because the labels of the data instances are integer variables, MMC is a hard non-convex optimization problem. Techniques such as semi-definite programming and cutting-plane methods have been embedded in MMC to tackle this problem, but their growing time complexity and premature convergence limit the applicability of MMC to large datasets. This paper proposes a fast multiway maximum margin clustering method based on a genetic algorithm (GAM3C). GAM3C first adopts the Nyström method to generate a low-rank approximation of the kernel matrix in the dual form of MMC, which reduces the scale of the original problem and speeds up the subsequent analysis. It then uses the solution-space alternation of the genetic algorithm to solve the non-convex optimization problem of MMC explicitly, obtaining the multiway clustering results simultaneously. Experimental results on real-world datasets show that GAM3C outperforms state-of-the-art maximum margin clustering algorithms in terms of accuracy and running time.
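The low-rank kernel approximation step mentioned in the abstract can be illustrated with a minimal NumPy sketch of the standard Nyström method. This is not the authors' implementation: the Gaussian kernel, the uniform random choice of landmark points, and the names `rbf_kernel`, `nystrom_features`, `gamma`, and `n_landmarks` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def nystrom_features(X, n_landmarks, gamma=1.0, seed=0):
    """Return Z such that Z @ Z.T approximates the full kernel matrix.

    Assumption: landmarks are sampled uniformly without replacement.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=n_landmarks, replace=False)

    C = rbf_kernel(X, X[idx], gamma)  # n x m block: all points vs. landmarks
    W = C[idx]                        # m x m block: landmarks vs. landmarks

    # Pseudo-inverse square root of W from its eigendecomposition,
    # dropping numerically zero eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(W)
    keep = eigvals > 1e-10 * eigvals.max()
    W_inv_sqrt = eigvecs[:, keep] / np.sqrt(eigvals[keep])

    # Z @ Z.T equals C @ pinv(W) @ C.T, the Nystrom approximation of K.
    return C @ W_inv_sqrt
```

With `n` data points and `m` landmarks, only the `n x m` block `C` is ever formed, so a downstream solver can work with the factor `Z` instead of the full `n x n` kernel matrix. How GAM3C plugs this factor into the dual form of MMC is not specified in the abstract and is not modeled here.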
Pages: 413-425
Number of pages: 13