Two-Stage Clustering with k-Means Algorithm

Cited by: 0
Authors
Salman, Raied [1]
Kecman, Vojislav [1]
Li, Qi [1]
Strack, Robert [1]
Test, Erick [1]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
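Keywords
Data Mining; Clustering; k-means algorithm; Distance Calculation
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract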
k-means has recently been recognized as one of the best algorithms for unsupervised clustering. Because k-means depends mainly on calculating the distances between all data points and the centers, its cost is high when the dataset is large (for example, more than 500MG points). We propose a two-stage algorithm to reduce the cost of calculation for huge datasets. The first stage is a fast calculation on a small portion of the data that produces the best location of the centers. The second stage is a slow calculation whose initial centers are taken from the first stage. The fast and slow stages represent the movement of the centers. In the slow stage, the whole dataset can be used to obtain the exact location of the centers. The cost of the fast stage is very low because only a small amount of data is used. The cost of the slow stage is also low because it needs few iterations.
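The two stages described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the small portion is chosen, how the first-stage centers are initialized, or what stopping rule is used, so uniform random sampling, random initialization, and a center-shift tolerance are assumed here. The names `lloyd`, `two_stage_kmeans`, and `sample_frac` are hypothetical.

```python
import numpy as np


def lloyd(X, centers, max_iter=100, tol=1e-6):
    """Standard k-means (Lloyd) iterations starting from the given centers.

    Returns the final centers, the labels from the last assignment step,
    and the number of iterations performed.
    """
    centers = centers.copy()
    for it in range(1, max_iter + 1):
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its previous center
                new_centers[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:  # assumed stopping rule: centers stopped moving
            break
    return centers, labels, it


def two_stage_kmeans(X, k, sample_frac=0.05, seed=0):
    """Fast stage on a random subset, then slow stage on the whole dataset."""
    rng = np.random.default_rng(seed)
    n = len(X)
    m = max(k, int(sample_frac * n))

    # Fast stage: cluster only a small random portion of the data (assumed uniform).
    sample = X[rng.choice(n, size=m, replace=False)]
    init = sample[rng.choice(m, size=k, replace=False)]
    fast_centers, _, fast_iters = lloyd(sample, init)

    # Slow stage: refine on the full dataset, starting from the fast-stage centers.
    centers, labels, slow_iters = lloyd(X, fast_centers)
    return centers, labels, fast_iters, slow_iters
```

Calling `two_stage_kmeans(X, k=3)` on an `(n, d)` float array returns the refined centers, the point labels, and the iteration counts of both stages, so the claim that the slow stage needs few iterations can be checked on a given dataset.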
Pages: 110-122
Page count: 13
Related papers
50 in total
  • [1] A Two-Stage Clustering Algorithm based on Improved K-means and Density Peak Clustering
    Xiao, Na
    Zhou, Xu
    Huang, Xin
    Yang, Zhibang
    2019 10TH IEEE INTERNATIONAL CONFERENCE ON BIG KNOWLEDGE (ICBK 2019), 2019, : 296 - 301
  • [2] Two-stage clustering and routing problem by using FCM and K-means with genetic algorithm
    Pekel Ozmen, Ebru
    Kucukdeniz, Tarik
    SIGMA JOURNAL OF ENGINEERING AND NATURAL SCIENCES-SIGMA MUHENDISLIK VE FEN BILIMLERI DERGISI, 2024, 42 (04): : 1030 - 1038
  • [3] A two-stage recommendation algorithm based on K-means clustering in mobile e-commerce
    Zhang, Fuzhi
    Liu, Huilin
    Chao, Jinbo
    Journal of Computational Information Systems, 2010, 6 (10): : 3327 - 3334
  • [4] Enhancing K-means Clustering Performance with a Two-Stage Hybrid Preprocessing Strategy
    Tripathi, Abhishek
    Tiwari, Aruna
    Chaudhari, Narendra S.
    Ratnaparkhe, Milind
    Dwivedi, Rajesh
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2024,
  • [5] A two-stage clustering method combining ant colony SOM and K-means
    Department of Industrial Engineering and Management Information, Huafan University, Taipei County, 223, Taiwan
    Unknown
    J. Inf. Sci. Eng., 2008, 5 (1445-1460):
  • [6] A two-stage clustering method combining ant colony SOM and K-means
    Chi, Sheng-Chai
    Yang, Chih-Chieh
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2008, 24 (05) : 1445 - 1460
  • [7] Research on k-means Clustering Algorithm: An Improved k-means Clustering Algorithm
    Shi Na
    Liu Xumin
    Guan Yong
    2010 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY AND SECURITY INFORMATICS (IITSI 2010), 2010, : 63 - 67
  • [8] A Method of Two-Stage Clustering with Constraints Using Agglomerative Hierarchical Algorithm and One-Pass K-Means
    Obara, Nobuhiro
    Miyamoto, Sadaaki
    6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS, 2012, : 1540 - 1544
  • [9] A Method of Two-Stage Clustering with Constraints Using Agglomerative Hierarchical Algorithm and One-Pass k-Means++
    Tamura, Yusuke
    Obara, Nobuhiro
    Miyamoto, Sadaaki
    KNOWLEDGE AND SYSTEMS ENGINEERING (KSE 2013), VOL 2, 2014, 245 : 9 - 19
  • [10] Fast Two-Stage Segmentation Based on Local Correntropy-Based K-Means Clustering
    Song, Yangyang
    Xie, Xiaozhen
    2017 IEEE 9TH INTERNATIONAL CONFERENCE ON COMMUNICATION SOFTWARE AND NETWORKS (ICCSN), 2017, : 1317 - 1323