Seed Point Selection Algorithm in Clustering of Image Data

Cited by: 6
Authors
Chowdhury, Kuntal [1]
Chaudhuri, Debasis [2]
Pal, Arup Kumar [3]
Affiliations
[1] DIT Univ, Dept Informat Technol, Dehra Dun 248001, Uttar Pradesh, India
[2] DIC DRDO, Bardhaman 713149, W Bengal, India
[3] Indian Sch Mines, Indian Inst Technol, Dept Comp Sci & Engn, Dhanbad 826004, Bihar, India
Source
PROGRESS IN INTELLIGENT COMPUTING TECHNIQUES: THEORY, PRACTICE, AND APPLICATIONS, VOL 2 | 2018 / Vol. 719
Keywords
Data mining; Clustering; Seed point; Shannon's entropy; K-Means; Initialization
DOI
10.1007/978-981-10-3376-6_13
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Massive amounts of data are being collected in almost all sectors of life due to recent technological advancements. Various data mining tools, including clustering, are often applied to huge data sets to extract hidden and previously unknown information that can be helpful in future decision-making. Clustering is an unsupervised technique that separates data points into homogeneous groups. The seed point, also called the core of the cluster, is an important feature of a clustering technique, and the performance of a seed-based clustering technique depends on the choice of the initial cluster centers. Initial seed point selection is challenging because it must lead to a better cluster partition while converging rapidly. In the present research we propose a seed point selection algorithm based on the maximization of Shannon's entropy with a distance restriction criterion, and apply it to image data, using the RGB features of color images, as well as to 2D data. Our seed point selection algorithm converges in a minimum number of steps while forming better clusters. We have applied our algorithm to different image data as well as discrete data, and the results appear to be satisfactory. We have also compared the results with other seed selection methods applied through the K-Means algorithm, in terms of number of iterations and CPU time.
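
The abstract does not spell out how the entropy criterion and the distance restriction are combined, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes that each distinct point (for example an RGB triple) is scored by its Shannon entropy term -p log2 p, where p is its relative frequency, and that candidates are accepted greedily in order of decreasing score as long as they lie at least `min_dist` from every seed already chosen. The names `select_seeds`, `k`, and `min_dist` are hypothetical.

```python
import math
from collections import Counter


def select_seeds(points, k, min_dist):
    """Pick up to k seeds by entropy score under a distance restriction.

    Assumptions of this sketch (not taken from the paper):
    - the score of a distinct point is -p * log2(p), p = relative frequency;
    - candidates are visited from highest to lowest score;
    - a candidate is kept only if it is at least min_dist (Euclidean)
      away from every seed already kept.
    """
    counts = Counter(map(tuple, points))
    n = len(points)
    # Shannon entropy term contributed by each distinct point.
    scores = {pt: -(c / n) * math.log2(c / n) for pt, c in counts.items()}
    # Highest score first; ties broken by coordinates for determinism.
    ranked = sorted(scores, key=lambda pt: (-scores[pt], pt))
    seeds = []
    for pt in ranked:
        if all(math.dist(pt, s) >= min_dist for s in seeds):
            seeds.append(pt)
            if len(seeds) == k:
                break
    return seeds
```

The returned seeds could then be passed as initial centers to a K-Means implementation. The greedy ordering and the scoring rule are design choices of this sketch, and other readings of the entropy criterion are possible.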
Pages: 119-126
Page count: 8