Seed selection algorithm through K-means on optimal number of clusters

Cited by: 0

Authors:

Kuntal Chowdhury

Debasis Chaudhuri

Arup Kumar Pal

Ashok Samal

Affiliations:

[1] Indian Institute of Technology (Indian School of Mines) [IIT (ISM)], Department of Computer Science and Engineering

[2] DRDO Integration Centre, Panagarh, West Bengal, Department of Computer Science and Engineering

[3] University of Nebraska

Source:

Multimedia Tools and Applications | 2019 / Vol. 78

Keywords:

Clustering; Cluster building time; Cluster validity indices; Joint probability; K-means; Seed point; Seed generation time; Segmentation entropy

DOI:

Not available

Abstract:

Clustering is an important unsupervised learning technique in data mining that groups similar features. The growing point of a cluster is known as a seed. Selecting an appropriate seed for each cluster is an important criterion of any seed-based clustering technique. The performance of seed-based algorithms depends on the initial cluster center selection and on the optimal number of clusters in an unknown data set. Cluster quality and the optimal number of clusters are important issues in cluster analysis. In this paper, the proposed seed point selection algorithm is applied to 3-band image data and 2D discrete data. The algorithm selects seed points by maximizing the joint probability of pixel intensities subject to a distance restriction criterion. The optimal number of clusters is decided on the basis of a combination of seven different cluster validity indices. We compare the results of our proposed seed selection algorithm, using K-means clustering on the optimal number of clusters, with other classical seed selection algorithms applied through K-means clustering in terms of seed generation time (SGT), cluster building time (CBT), segmentation entropy, and the number of iterations (NOT_K-means). We also analyze the CPU time and number of iterations of our proposed seed selection method against other clustering algorithms.
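
The abstract names the idea (seeds chosen by maximizing a joint probability, subject to a distance restriction, then refined with K-means) but gives no algorithmic details. The following is only a rough sketch of that general idea, not the authors' implementation. It assumes the joint probability is estimated from a multi-dimensional histogram and that the distance restriction is a minimum Euclidean distance between seeds. The names `select_seeds`, `kmeans`, `min_dist`, and `bins` are hypothetical.

```python
import numpy as np

def select_seeds(points, k, min_dist, bins=8):
    """Greedily pick up to k seeds from the densest histogram cells.

    Assumption: joint probability is approximated by a normalized
    multi-dimensional histogram; seeds must be >= min_dist apart.
    """
    points = np.asarray(points, dtype=float)
    hist, edges = np.histogramdd(points, bins=bins)
    prob = hist / hist.sum()
    # Map every point to its histogram cell along each dimension.
    cell = tuple(
        np.clip(np.digitize(points[:, d], edges[d][1:-1]), 0, bins - 1)
        for d in range(points.shape[1])
    )
    order = np.argsort(-prob[cell], kind="stable")  # most probable first
    seeds = []
    for i in order:
        cand = points[i]
        if all(np.linalg.norm(cand - s) >= min_dist for s in seeds):
            seeds.append(cand)
            if len(seeds) == k:
                break
    return np.array(seeds)  # may hold fewer than k seeds

def kmeans(points, seeds, max_iter=100):
    """Plain Lloyd iterations started from the given seeds."""
    points = np.asarray(points, dtype=float)
    centers = np.array(seeds, dtype=float)
    for n_iter in range(1, max_iter + 1):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels, n_iter
```

In this sketch the number of clusters `k` is passed in by the caller. The paper instead decides it from a combination of seven cluster validity indices, which is not reproduced here.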

Pages: 18617-18651

Page count: 34
