Automatic centroid initialization in k-means using artificial hummingbird algorithm

Cited by: 0
Authors
Preeti [1]
Kusum Deep [2]
Affiliations
[1] Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand
[2] The University of Tennessee Health Science Centre, Memphis, TN 38163
Keywords
Clustering analysis; Data clustering; K-means; Nature inspired algorithm
DOI
10.1007/s00521-024-10764-4
Abstract
K-means is a widely used clustering technique whose results depend heavily on the initial locations of the cluster centroids. Poorly chosen centroids can cause the algorithm to get trapped in suboptimal solutions. Additionally, determining the optimal number of clusters for large datasets is computationally expensive. To address these challenges, the recently developed Artificial Hummingbird Algorithm (AHA) is used to initialize cluster centroid locations and automatically determine the best estimate for the number of clusters. AHA simulates the specialized flight skills and intelligent foraging strategies of hummingbirds, striking a fine balance between exploration and exploitation during the search process. Unlike other data clustering approaches that use a fixed threshold in heuristic methods, we propose a dynamic threshold, based on the variance of the data with respect to its centroids, for activating cluster centroids in AHA. The data are automatically partitioned into k clusters such that cohesion, measured by cluster diameters, and separation, measured by nearest neighbor distance, are optimized. The algorithm is tested on various datasets, including real-world data, fundamental clustering benchmarks, synthetic data, and high-dimensional data. Performance is evaluated using fitness value, inter-cluster distance, and intra-cluster distance. Results indicate that the proposed method ranked first and achieved superior clustering performance compared to state-of-the-art algorithms. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
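The two quality notions named in the abstract, cohesion via cluster diameter and separation via nearest neighbor distance, can be illustrated with a minimal Python sketch. This is not the paper's fitness function, which the abstract does not specify, and it does not implement AHA or the dynamic threshold. It assumes the common definitions (diameter as the largest pairwise distance inside a cluster, separation as the smallest distance between points of different clusters), and the helper names and toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def assign_to_nearest(points, centroids):
    """Group points by the index of their nearest centroid."""
    clusters = {i: [] for i in range(len(centroids))}
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    # Drop centroids that attracted no points.
    return [members for members in clusters.values() if members]


def max_diameter(clusters):
    """Cohesion (assumed definition): largest pairwise distance inside any cluster."""
    return max(
        (dist(a, b) for c in clusters for a, b in combinations(c, 2)),
        default=0.0,
    )


def min_separation(clusters):
    """Separation (assumed definition): smallest distance between points of different clusters."""
    return min(
        (dist(a, b) for c1, c2 in combinations(clusters, 2) for a in c1 for b in c2),
        default=float("inf"),
    )


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# One centroid per group versus both centroids placed inside the same group.
good = assign_to_nearest(points, [(0.3, 0.3), (10.3, 10.3)])
poor = assign_to_nearest(points, [(0.0, 0.0), (0.0, 1.0)])

for name, clusters in (("good", good), ("poor", poor)):
    print(name, round(max_diameter(clusters), 3), round(min_separation(clusters), 3))
```

On this toy data, the poorly placed centroids yield a larger maximum diameter and a smaller separation than the well-placed ones, which is the kind of difference a search over centroid locations would try to exploit.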
Pages: 3373-3398
Page count: 25