An Initialization Method Based on Hybrid Distance for k-Means Algorithm

Cited by: 11
Authors
Yang, Jie [1]
Ma, Yan [1]
Zhang, Xiangfen [1]
Li, Shunbao [2]
Zhang, Yuping [1]
Affiliations
[1] Shanghai Normal Univ, Coll Informat Mech & Elect Engn, Shanghai 200234, Peoples R China
[2] Shanghai Normal Univ, Coll Math & Sci, Shanghai 200234, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1162/neco_a_01014
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The traditional k-means algorithm is widely used as a simple and efficient clustering method. However, its performance depends heavily on the selection of initial cluster centers, so the method used to choose them is extremely important. In this letter, we redefine the density of a point according to the number of its neighbors and the distances between the point and those neighbors. In addition, we define a new distance measure that considers both Euclidean distance and density. Based on that, we propose an algorithm for selecting initial cluster centers that can dynamically adjust the weighting parameter. Furthermore, we propose a new internal clustering validation measure, the clustering validation index based on the neighbors (CVN), which can be exploited to select the optimal result among multiple clustering results. Experimental results show that the proposed algorithm outperforms existing initialization methods on real-world data sets and demonstrate its adaptability to data sets with various characteristics.
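The abstract does not state the paper's formulas, so the following is only a rough sketch of the general idea of seeding k-means with a score that mixes Euclidean distance and local density. Everything specific here is an assumption made for illustration, not the authors' method: the function names `knn_density` and `hybrid_init`, the density definition (inverse mean distance to the `k` nearest neighbors), the linear mixing rule, and the fixed weight `w` (the paper adjusts its weighting parameter dynamically).

```python
import numpy as np

def knn_density(X, k=5):
    """Return pairwise distances and an assumed k-NN density estimate."""
    # Full pairwise Euclidean distance matrix, shape (n, n).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Distances to the k nearest neighbors; column 0 is the point itself.
    nn = np.sort(d, axis=1)[:, 1:k + 1]
    # Assumption: density is the inverse of the mean neighbor distance.
    rho = 1.0 / (nn.mean(axis=1) + 1e-12)
    return d, rho

def hybrid_init(X, n_clusters, k=5, w=0.5):
    """Pick initial centers by mixing distance and density (illustrative)."""
    d, rho = knn_density(X, k)
    # Scale both terms to [0, 1] so the weight w is meaningful.
    d_norm = d / max(d.max(), 1e-12)
    rho_norm = (rho - rho.min()) / (rho.max() - rho.min() + 1e-12)
    # Start from the densest point.
    centers = [int(np.argmax(rho_norm))]
    while len(centers) < n_clusters:
        # Distance from every point to its closest already-chosen center.
        min_d = d_norm[:, centers].min(axis=1)
        # Assumed hybrid score: far from chosen centers and locally dense.
        score = w * min_d + (1.0 - w) * rho_norm
        score[centers] = -np.inf  # never pick the same point twice
        centers.append(int(np.argmax(score)))
    return X[centers]
```

The returned array can be passed as the starting centers to any k-means implementation that accepts explicit initial centers. A larger `w` favors spreading the seeds apart, while a smaller `w` favors dense regions.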
Pages: 3094-3117
Number of pages: 24
Related papers
50 in total
  • [21] LeaderRank based k-means clustering initialization method for collaborative filtering
    Kant, Surya
    Mahara, Tripti
    Jain, Vinay Kumar
    Jain, Deepak Kumar
    Sangaiah, Arun Kumar
    COMPUTERS & ELECTRICAL ENGINEERING, 2018, 69 : 598 - 609
  • [22] A Clustering Method Based on K-Means Algorithm
    Li, Youguo
    Wu, Haiyan
    INTERNATIONAL CONFERENCE ON SOLID STATE DEVICES AND MATERIALS SCIENCE, 2012, 25 : 1104 - 1109
  • [23] An Improved K-means Algorithm Based on Weighted Euclidean Distance
    Ge, Fuhua
    Luo, Yi
    2012 THIRD INTERNATIONAL CONFERENCE ON THEORETICAL AND MATHEMATICAL FOUNDATIONS OF COMPUTER SCIENCE (ICTMF 2012), 2013, 38 : 117 - 120
  • [24] Design of K-Means Clustering Algorithm Based on Distance Concentration
    Liu, Tao
    Dai, Guiping
    Zhang, Li
    Wang, Zhijie
    PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON ELECTRONIC COMMERCE AND SECURITY, VOL II, 2009, : 256 - +
  • [25] An empirical comparison of four initialization methods for the K-Means algorithm
    Peña, JM
    Lozano, JA
    Larrañaga, P
    PATTERN RECOGNITION LETTERS, 1999, 20 (10) : 1027 - 1040
  • [26] DETERMINISTIC INITIALIZATION OF THE K-MEANS ALGORITHM USING HIERARCHICAL CLUSTERING
    Celebi, M. Emre
    Kingravi, Hassan A.
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2012, 26 (07)
  • [27] AN EFFICIENT K-MEANS CLUSTERING INITIALIZATION USING OPTIMIZATION ALGORITHM
    Divya, V.
    Deepika, R.
    Yamini, C.
    Sobiyaa, P.
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING & COMMUNICATION ENGINEERING (ICACCE-2019), 2019,
  • [28] K-means algorithm with a novel distance measure
    Abudalfa, Shadi I.
    Mikki, Mohammad
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2013, 21 (06) : 1665 - 1684
  • [29] Improved K-Means Algorithm Based on Hybrid Rice Optimization Algorithm
    Liu, Chuan
    Wang, Chunzhi
    Hu, Jixiong
    Ye, Zhiwei
    PROCEEDINGS OF THE 2017 9TH IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA ACQUISITION AND ADVANCED COMPUTING SYSTEMS: TECHNOLOGY AND APPLICATIONS (IDAACS), VOL 2, 2017, : 788 - 791
  • [30] MST-Based Cluster Initialization for K-Means
    Reddy, Damodar
    Mishra, Devender
    Jana, Prasanta K.
    ADVANCES IN COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, PT I, 2011, 131 : 329 - 338