A Comparative Performance Study of Hybrid Firefly Algorithms for Automatic Data Clustering

Cited by: 30
Authors
Ezugwu, Absalom El-Shamir [1]
Agbaje, Moyinoluwa B. [1]
Aljojo, Nahla [2]
Els, Rosanne [1]
Chiroma, Haruna [3]
Abd Elaziz, Mohamed [4]
Affiliations
[1] Univ KwaZulu Natal, Sch Comp Sci, Pietermaritzburg Campus, ZA-3201 Pietermaritzburg, South Africa
[2] Univ Jeddah, Dept Informat Syst & Technol, Coll Comp Sci & Engn, Jeddah 23218, Saudi Arabia
[3] Natl Yunlin Univ Sci & Technol, Future Technol Res Ctr, Touliu 64002, Yunlin, Taiwan
[4] Zagazig Univ, Dept Math, Zagazig 14459, Egypt
Keywords
Automatic clustering; firefly algorithm; firefly-based hybrid algorithms; clustering validity index; particle swarm optimization; bee colony optimization; differential evolution; genetic algorithm; search algorithm; PSO
DOI
10.1109/ACCESS.2020.3006173
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In cluster analysis, a long-standing goal has been to devise the best possible means of automatically determining the number of clusters. However, because of a lack of prior domain knowledge and the uncertainty associated with data object characteristics, it is challenging to choose an appropriate number of clusters, especially when dealing with data objects of high dimensionality and varying sizes and densities. In the last few decades, researchers have proposed and developed several nature-inspired metaheuristic algorithms to solve data clustering problems. Many studies have shown that the firefly algorithm is a robust, efficient, and effective nature-inspired swarm intelligence global search technique, which has been successfully applied to diverse NP-hard optimization problems. However, the diversification search process employed by the firefly algorithm can lead to reduced speed and convergence rate on large-scale optimization problems. This study therefore investigates the application of four hybrid firefly algorithms to the task of automatically clustering high-density, large-scale unlabelled datasets. In contrast to most existing classical heuristic-based data clustering techniques, the proposed hybrid algorithms do not require any prior knowledge of the data objects to be classified. Instead, the hybrid methods determine the optimal number of clusters automatically and empirically during program execution. Two well-known clustering validity indices, namely the Compact-Separated and Davies-Bouldin indices, are employed to evaluate the implemented firefly hybrid algorithms. Furthermore, twelve standard ground-truth clustering datasets from the UCI Machine Learning Repository are used to evaluate the robustness and effectiveness of the algorithms against those of classical swarm optimization algorithms and other related clustering results from the literature. The experimental results show that the new clustering methods clearly outperform existing standalone and other hybrid metaheuristic techniques in terms of clustering validity measures.
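As an illustration of how a clustering validity index can score a candidate partition, the following minimal sketch computes the Davies-Bouldin index in its common textbook form (mean Euclidean scatter per cluster, Euclidean distance between centroids). It is not taken from the paper: the function name `davies_bouldin` and the toy data are assumptions, and the variant used by the authors may differ in its details.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labelled partition (lower is better).

    Textbook form, assumed here for illustration: for each cluster i,
    take the worst ratio (S_i + S_j) / d(c_i, c_j) over all other
    clusters j, then average these worst ratios over all clusters.
    """
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    # Centroid of each cluster: the coordinate-wise mean.
    centroids = {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in clusters.items()
    }
    # Scatter S_i: mean distance from the members to their centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    # Note: two clusters with identical centroids would divide by zero;
    # this sketch does not guard against that case.
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Toy data (hypothetical): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_labels = [0, 0, 1, 1]  # groups the nearby points together
bad_labels = [0, 1, 0, 1]   # mixes the two groups

print(davies_bouldin(points, good_labels))
print(davies_bouldin(points, bad_labels))
```

In an automatic clustering setting, a metaheuristic would evaluate such an index on candidate solutions that encode different numbers of clusters and keep the candidate with the lowest value, which is how the number of clusters can be chosen without being fixed in advance.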
Pages: 121089-121118
Number of pages: 30