Optimized gravitational-based data clustering algorithm

Cited by: 15
Authors
Alswaitti, Mohammed [1]
Ishak, Mohamad Khairi [1]
Isa, Nor Ashidi Mat [1]
Affiliations
[1] Univ Sains Malaysia, Sch Elect & Elect Engn, Engn Campus, Nibong Tebal 14300, Penang, Malaysia
Keywords
Gravitational clustering; Centroid initialization; Nature-inspired algorithms; Exploitation and exploration balance; Clustering analysis; SYSTEMS;
DOI
10.1016/j.engappai.2018.05.004
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Gravitational clustering is a nature-inspired, heuristic-based technique. The performance of nature-inspired algorithms relies on the balance achieved between exploitation and exploration. This paper proposes a modification of a data clustering algorithm based on the universal law of gravity. Although the gravitational clustering algorithm has a high exploration ability, it lacks a proper exploitation mechanism: the agents that search the solution space have impulsive velocities, which lead to very large steps in agent positions across iterations. This study proposes the following solutions to impose a balance between exploitation and exploration: (i) the dependence of the agent on its velocity history is removed to avoid the high velocity caused by accumulating previous velocities, and (ii) an initialization step for centroid positions is added, using the variance and median initialization method with a predefined number of clusters. The initialization step eliminates the effects of random initialization and takes the place of the exploration process. Experiments are conducted using 13 benchmark datasets from the UCI machine learning repository. In addition, the proposed algorithm is tested on two case studies using the electrical hotspots and cervical cell datasets. The performance of the proposed clustering algorithm is compared qualitatively and quantitatively with several state-of-the-art clustering algorithms. The results indicate that the proposed clustering algorithm outperforms conventional techniques. Furthermore, the clusters obtained using the proposed algorithm are more homogeneous than those obtained using conventional techniques. Quantitatively, the proposed algorithm achieves better results than the other techniques in 9 out of 15 datasets in terms of accuracy, F-score, and purity.
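The two modifications named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes (a) a common variance-and-median initialization in which the data are sorted along the highest-variance feature, cut into k contiguous groups, and the middle point of each group becomes a centroid, and (b) the velocity update usually given for gravitational search, in which the previous velocity is scaled by a random factor and added to the current acceleration. The function names `variance_median_init` and `step` and the sample data are hypothetical.

```python
import random
from statistics import pvariance


def variance_median_init(points, k):
    """Assumed initialization: sort by the highest-variance feature,
    cut into k contiguous groups, take each group's middle point."""
    dims = len(points[0])
    # Feature whose values vary the most across the dataset.
    split_dim = max(range(dims), key=lambda d: pvariance([p[d] for p in points]))
    ordered = sorted(points, key=lambda p: p[split_dim])
    n = len(ordered)
    centroids = []
    for g in range(k):
        group = ordered[g * n // k:(g + 1) * n // k]
        # Middle element of the sorted group (its median along split_dim).
        centroids.append(list(group[len(group) // 2]))
    return centroids


def step(position, velocity, acceleration, keep_history, rng):
    """Move one agent once.

    keep_history=True : v_new = rand * v_old + a  (velocity accumulates)
    keep_history=False: v_new = a                 (no velocity memory)
    """
    new_velocity = []
    for v, a in zip(velocity, acceleration):
        carried = rng.random() * v if keep_history else 0.0
        new_velocity.append(carried + a)
    new_position = [x + v for x, v in zip(position, new_velocity)]
    return new_position, new_velocity


if __name__ == "__main__":
    data = [[1, 10], [2, 50], [3, 20], [4, 90], [5, 30], [6, 70]]
    print("initial centroids:", variance_median_init(data, k=2))

    rng = random.Random(0)
    # Same agent, same acceleration, with and without velocity memory.
    print("with history:   ", step([0.0], [10.0], [1.0], True, rng))
    print("without history:", step([0.0], [10.0], [1.0], False, rng))
```

In the memoryless variant the step length is bounded by the current acceleration alone, which is one way to read the abstract's claim that dropping the velocity history avoids large steps. The deterministic initialization likewise removes the run-to-run variation that random starting centroids would introduce.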
Pages: 126-148
Number of pages: 23
Related papers
50 in total
  • [31] Visualization of financial data based on SOM and gravitational field clustering
    Liu, Fang
    Tian, Kai
    Zhou, Zhiguang
    Lin, Hai
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2012, 24 (04): : 435 - 442
  • [32] An efficient three-way clustering algorithm based on gravitational search
    Yu, Hong
    Chang, Zhihua
    Wang, Guoyin
    Chen, Xiaofang
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2020, 11 (05) : 1003 - 1016
  • [33] An efficient three-way clustering algorithm based on gravitational search
    Hong Yu
    Zhihua Chang
    Guoyin Wang
    Xiaofang Chen
    International Journal of Machine Learning and Cybernetics, 2020, 11 : 1003 - 1016
  • [34] An Optimized Farthest First Clustering Algorithm
    Sharmila
    Kumar, Mukesh
    2013 4TH NIRMA UNIVERSITY INTERNATIONAL CONFERENCE ON ENGINEERING (NUICONE 2013), 2013,
  • [35] A Density Clustering Algorithm Based on Data Partitioning
    Li, Dongping
    PROCEEDINGS OF ANNUAL CONFERENCE OF CHINA INSTITUTE OF COMMUNICATIONS, 2010, : 251 - 254
  • [36] Study of clustering algorithm based on model data
    Li, Kai
    Cui, Li-Juan
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 3961 - +
  • [37] A new algorithm based on metaheuristics for data clustering
    Tsutomu Shohdohji
    Fumihiko Yano
    Yoshiaki Toyoda
    Journal of Zhejiang University-SCIENCE A, 2010, 11 : 921 - 926
  • [38] Clustering based Compress Data Cube algorithm
    Xie, Zhijun
    Nie, Mingxing
    Wang, Tongsen
    2009 WRI WORLD CONGRESS ON SOFTWARE ENGINEERING, VOL 4, PROCEEDINGS, 2009, : 429 - 433
  • [39] A new algorithm based on metaheuristics for data clustering
    Tsutomu SHOHDOHJI
    Fumihiko YANO
    Yoshiaki TOYODA
    Journal of Zhejiang University-Science A(Applied Physics & Engineering), 2010, (12) : 921 - 926
  • [40] A new algorithm based on metaheuristics for data clustering
    Shohdohji, Tsutomu
    Yano, Fumihiko
    Toyoda, Yoshiaki
    JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE A, 2010, 11 (12): : 921 - 926