A hybrid approach for data clustering based on modified cohort intelligence and K-means

Cited by: 55
Authors
Krishnasamy, Ganesh [1]
Kulkarni, Anand J. [2]
Paramesran, Raveendran [1]
Affiliations
[1] Univ Malaya, Fac Engn, Dept Elect Engn, Kuala Lumpur 50603, Malaysia
[2] Univ Windsor, Odette Sch Business, Windsor, ON N9B 3P4, Canada
Keywords
Clustering; Cohort intelligence; Meta-heuristic algorithm; Particle swarm optimization; Information retrieval; Genetic algorithm; Colony approach; Model; Recognition; Improve; Pattern
DOI
10.1016/j.eswa.2014.03.021
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an important and popular technique in data mining. It partitions a set of objects so that objects in the same cluster are more similar to one another than to objects in different clusters, according to certain predefined criteria. K-means is a simple yet efficient method for data clustering. However, K-means tends to converge to local optima, and its result depends on the initial values of the cluster centers. Many heuristic algorithms have been introduced to overcome this local optima problem, but these algorithms also suffer from several shortcomings. In this paper, we present an efficient hybrid evolutionary data clustering algorithm, referred to as K-MCI, which combines K-means with modified cohort intelligence. The proposed algorithm is tested on several standard data sets from the UCI Machine Learning Repository, and its performance is compared with that of other well-known algorithms: K-means, K-means++, cohort intelligence (CI), modified cohort intelligence (MCI), genetic algorithm (GA), simulated annealing (SA), tabu search (TS), ant colony optimization (ACO), honey bee mating optimization (HBMO) and particle swarm optimization (PSO). The simulation results are very promising in terms of solution quality and convergence speed. © 2014 Elsevier Ltd. All rights reserved.
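The abstract's remark that K-means converges to local optima and depends on the initial cluster centers can be illustrated with a small sketch. This is plain K-means (Lloyd's iterations) on a hypothetical four-point data set, not the K-MCI algorithm from the paper. The function names, the toy data, and the use of the sum of squared distances as the quality measure are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points: Sequence[Point], init_centers: Sequence[Point],
           max_iter: int = 100) -> Tuple[List[Point], float]:
    """Run plain K-means from the given initial centers.

    Returns the final centers and the sum of squared distances (SSE)
    from each point to its nearest center.
    """
    centers = list(init_centers)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        groups: List[List[Point]] = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
            if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


# Hypothetical toy data: corners of a 4 x 1 rectangle, clustered with k = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Both initial centers on the left edge: converges to a poor local optimum.
bad_centers, bad_sse = kmeans(points, [(0.0, 0.0), (0.0, 1.0)])

# One initial center on each short edge: converges to the better partition.
good_centers, good_sse = kmeans(points, [(0.0, 0.0), (4.0, 0.0)])

print(bad_sse, good_sse)  # 16.0 1.0
```

Both runs stop at a fixed point of the K-means iteration, but the first is stuck at a much worse objective value. This is the kind of initialization sensitivity that hybrid approaches such as the one described in the abstract aim to reduce.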
Pages: 6009-6016
Number of pages: 8