k-MM: A Hybrid Clustering Algorithm Based on k-Means and k-Medoids

Cited by: 5
Authors
Drias, Habiba [1]
Cherif, Nadjib Fodil [1]
Kechid, Amine [1]
Affiliations
[1] USTHB, Dept Comp Sci, LRIA, Algiers, Algeria
Source
ADVANCES IN NATURE AND BIOLOGICALLY INSPIRED COMPUTING | 2016 / Vol. 419
Keywords
Data mining; Clustering based on partitioning; K-means; PAM; Hybrid algorithm; Image clustering application
DOI
10.1007/978-3-319-27400-3_4
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
k-means and k-medoids have been the most popular partitioning-based clustering algorithms for many decades. With heuristics such as Lloyd's algorithm, k-means is easy to implement and can be applied to large data sets. However, it has drawbacks such as the inefficiency of the metric it uses, the difficulty of choosing the input k, and premature convergence. In contrast, k-medoids takes more time to produce a clustering but yields a better-quality result. Moreover, it is more robust to noise and outliers. In this article, we design a hybrid algorithm, k-MM, that takes advantage of both algorithms. We evaluated k-MM experimentally and show that, compared with k-means and k-medoids, it is very efficient and effective. We also present an application to image clustering and show that k-MM discovers clusters faster and more effectively than a recent work in the literature.
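The robustness claim in the abstract can be illustrated with a minimal sketch. It is not the k-MM algorithm, which the abstract does not specify. It only contrasts the two kinds of cluster center, a mean (as in k-means) and a medoid (as in k-medoids), on one cluster that contains an outlier. The function names `mean_center` and `medoid`, the Euclidean distance, and the sample points are assumptions made for illustration.

```python
import math


def mean_center(points):
    """Coordinate-wise mean of the points (the k-means style center)."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def medoid(points):
    """Member point with the smallest total distance to all points
    (the k-medoids style center). Euclidean distance is an assumption."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))


# Hypothetical cluster: four nearby points plus one far outlier.
cluster = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (50.0, 50.0)]

centroid = mean_center(cluster)   # pulled toward the outlier
center_medoid = medoid(cluster)   # stays on an actual data point

print("mean:", centroid)
print("medoid:", center_medoid)
```

In this example the mean is dragged far from the four nearby points, while the medoid remains one of them. The medoid search compares every candidate against every point, which is quadratic in the cluster size and reflects the extra running time the abstract attributes to k-medoids.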
Pages: 37-48
Page count: 12