Unsupervised K-Means Clustering Algorithm

Cited by: 1059
Authors
Sinaga, Kristina P. [1]
Yang, Miin-Shen [1]
Affiliations
[1] Chung Yuan Christian Univ, Dept Appl Math, Taoyuan 32023, Taiwan
Keywords
Clustering algorithms; Indexes; Linear programming; Entropy; Clustering methods; Unsupervised learning; Machine learning algorithms; Clustering; K-means; number of clusters; initializations; unsupervised learning schema; Unsupervised k-means (U-k-means); VALIDATION; INFORMATION; SELECTION; NUMBER; EM;
DOI
10.1109/ACCESS.2020.2988796
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The k-means algorithm is generally the best-known and most widely used clustering method, and various extensions of it have been proposed in the literature. Although k-means is an unsupervised learning approach to clustering in pattern recognition and machine learning, the algorithm and its extensions are always influenced by initialization and require the number of clusters to be given a priori. In that sense, k-means is not exactly an unsupervised clustering method. In this paper, we construct an unsupervised learning schema for the k-means algorithm so that it is free of initialization and parameter selection and can simultaneously find an optimal number of clusters. That is, we propose a novel unsupervised k-means (U-k-means) clustering algorithm that automatically finds an optimal number of clusters without requiring any initialization or parameter selection. The computational complexity of the proposed U-k-means algorithm is also analyzed, and the algorithm is compared with existing methods. The experimental results and comparisons demonstrate these advantages of the proposed U-k-means clustering algorithm.
Pages: 80716-80727
Number of pages: 12