Nature-inspired metaheuristic techniques for automatic clustering: a survey and performance study

Cited by: 40
Author
Ezugwu, Absalom E. [1 ]
Affiliation
[1] Univ KwaZulu Natal, Sch Comp Sci, King Edward Rd,Pietermaritzburg Campus, ZA-3201 Pietermaritzburg, Kwazulu Natal, South Africa
Source
SN APPLIED SCIENCES | 2020 / Vol. 2 / Issue 02
Keywords
Automatic clustering; Clustering analysis; Differential evolution; Particle swarm optimization; Firefly algorithm; Invasive weed optimization; Artificial bee colony; Bees algorithm; Biogeography-based optimization; Harmony search; Symbiotic organisms search; Teaching-learning-based optimization; DB validity index; CS validity index; PARTICLE SWARM OPTIMIZATION; INVASIVE WEED OPTIMIZATION; BEE COLONY OPTIMIZATION; DIFFERENTIAL EVOLUTION; SEARCH ALGORITHM; GENETIC ALGORITHMS; FIREFLY ALGORITHM;
DOI
10.1007/s42452-020-2073-0
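Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code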
07 ; 0710 ; 09 ;
Abstract
The application of several swarm intelligence and evolutionary metaheuristic algorithms to data clustering problems has, in the past few decades, gained wide popularity and acceptance due to their success in finding good-quality solutions to a variety of complex real-world optimization problems. Clustering is considered one of the most important data analysis techniques in the domain of data mining. A clustering problem refers to the partitioning of unlabeled data objects into a certain number of clusters based on their attribute values or features, with the objective of maximizing intra-cluster homogeneity and inter-cluster heterogeneity. This paper presents an up-to-date survey of major nature-inspired metaheuristic algorithms that have been employed to solve automatic clustering problems. Further, a comparative study of several modified well-known global metaheuristic algorithms is carried out to solve automatic clustering problems. Also, three hybrid swarm intelligence and evolutionary algorithms, namely, particle swarm differential evolution algorithm, firefly differential evolution algorithm and invasive weed optimization differential evolution algorithm, are proposed to deal with the task of automatic data clustering. In contrast to many of the existing traditional and evolutionary computational clustering techniques, the clustering algorithms presented in this paper do not require any predetermined information or prior knowledge of the dataset that is to be classified; rather, they are capable of spontaneously identifying the optimal number of partitions of the data points during the course of program execution. Forty-one benchmark datasets, comprising eleven artificial and thirty real-world datasets, are collated and utilized to evaluate the performances of the representative nature-inspired clustering algorithms.
According to the extensive experimental results, comparisons and statistical significance, the firefly algorithm appeared to be more appropriate for better clustering of both low and high dimensional data objects than were other state-of-the-art algorithms. Further, an experimental study demonstrates the superiority of the three proposed hybrid algorithms over the standard state-of-the-art methods in finding meaningful clustering solutions to the problem at hand.
Pages: 57