Multilevel core-sets based aggregation clustering algorithm

Cited by: 6
Authors
Ma, Ru-Ning [1]
Wang, Xiu-Li [1]
Ding, Jun-Di [2]
Affiliations
[1] College of Science, Nanjing University of Aeronautics and Astronautics
[2] School of Computer Science and Technology, Nanjing University of Science and Technology
Source
Ruan Jian Xue Bao/Journal of Software | 2013 / Vol. 24 / No. 03
Keywords
Aggregation; Core-set; Large size; Multilevel
DOI
10.3724/SP.J.1001.2013.04322
Abstract
Many classical clustering algorithms, such as Average-link, K-means, K-medoids, CLARA, and CLARANS, are based on a single cluster center and tend to discover only convex-structured clusters. Other methods, e.g., CURE and DBSCAN, use more than one point to represent a cluster and can find some well-separated clusters of arbitrary shape. However, they consider only the original scale of the input data and therefore cannot separate overlapping or noisy clusters. To this end, this paper proposes a multilevel core-set based agglomerative clustering algorithm (MulCA). The idea of MulCA is that the clustering structure is described by multilevel core sets. The clustering process is carried out through a procedure in which the top of the core set automatically becomes the underlying data set. In addition, by introducing a random sampling based (-core set (RBC), MulCA can be applied to large-scale data sets. A large number of numerical experiments verify the MulCA algorithm. © Copyright 2013, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
Pages: 490-506
Page count: 16
Related papers
21 entries in total
[1] Kaufman L., Rousseeuw P.J., Finding Groups in Data: An Introduction to Cluster Analysis, (1990)
[2] Jain A.K., Murty M.N., Flynn P.J., Data clustering: A review, ACM Computing Surveys, 31, 3, pp. 264-323, (1999)
[3] Xu R., Wunsch D., Survey of clustering algorithms, IEEE Trans. on Neural Networks, 16, 3, pp. 645-678, (2005)
[4] Shi J.B., Malik J., Normalized cuts and image segmentation, IEEE Trans. on Pattern Analysis and Machine Intelligence, 22, 8, pp. 888-905, (2000)
[5] Pal N.R., Pal S.K., A review on image segmentation techniques, Pattern Recognition, 26, 9, pp. 1277-1294, (1993)
[6] Datta R., Joshi D., Li J., Wang J.Z., Image retrieval: Ideas, influences, and trends of the new age, ACM Computing Surveys, 40, 2, pp. 1-60, (2008)
[7] MacQueen J.B., Some methods for classification and analysis of multivariate observations, Proc. of the 5th Berkeley Symp. on Mathematical Statistics and Probability, pp. 281-297, (1967)
[8] Park H.S., Jun C.H., A simple and fast algorithm for K-medoids clustering, Expert Systems with Applications, 36, 2, pp. 3336-3341, (2009)
[9] Ng R., Han J., CLARANS: A method for clustering objects for spatial data mining, IEEE Trans. on Knowledge and Data Engineering, 14, 5, pp. 1003-1016, (2002)
[10] Guha S., Rastogi R., Shim K., CURE: An efficient clustering algorithm for large databases, Proc. of the ACM SIGMOD, pp. 73-84, (1998)