Ranked k-medoids: A fast and accurate rank-based partitioning algorithm for clustering large datasets

被引:55
作者
Zadegan, Seyed Mohammad Razavi [1 ]
Mirzaie, Mehdi [2 ,3 ]
Sadoughi, Farahnaz [1 ]
机构
[1] Univ Tehran Med Sci, Sch Hlth Management & Informat Sci, Dept Hlth Informat Management, Tehran, Iran
[2] Shahid Beheshti Univ Med Sci, Fac Paramed Sci, Prote Res Ctr, Tehran, Iran
[3] Inst Res Fundamental Sci IPM, Sch Comp Sci, Dept Bioinformat, Tehran, Iran
关键词
Clustering analysis; Partitioning clustering; k-Medoids clustering; k-Harmonic means; External validation measures; HARMONIC MEANS;
D O I
10.1016/j.knosys.2012.10.012
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Clustering analysis is the process of dividing a set of objects into none-overlapping subsets. Each subset is a cluster, such that objects in the cluster are similar to one another and dissimilar to the objects in the other clusters. Most of the algorithms in partitioning approach of clustering suffer from trapping in local optimum and the sensitivity to initialization and outliers. In this paper, we introduce a novel partitioning algorithm that its initialization does not lead the algorithm to local optimum and can find all the Gaussian-shaped clusters if it has the right number of them. In this algorithm, the similarity between pairs of objects are computed once and updating the medoids in each iteration costs O(k x m) where k is the number of clusters and m is the number of objects needed to update medoids of the clusters. Comparison between our algorithm and two other partitioning algorithms is performed by using four well-known external validation measures over seven standard datasets. The results for the larger datasets show the superiority of the proposed algorithm over two other algorithms in terms of speed and accuracy. (c) 2012 Elsevier B.V. All rights reserved.
引用
收藏
页码:133 / 143
页数:11
相关论文
共 40 条