Parallel K-Medoids Improved Algorithm Based on MapReduce

Cited by: 3
Authors
Zhao, Yonghan [1]
Chen, Bin [1]
Li, Mengyu [1]
Affiliations
[1] Yangzhou Univ, Sch Informat Engn, Yangzhou, Jiangsu, Peoples R China
Source
2018 SIXTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD) | 2018
Keywords
K-Medoids; Canopy algorithm; Max-Min distance algorithm; MapReduce;
DOI
10.1109/CBD.2018.00013
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812
Abstract
K-Medoids is a partition-based clustering algorithm that is simple to implement, robust, and accurate. However, it has several disadvantages: it depends strongly on the choice of initial centers, the number of clusters K is not known in advance, frequent iteration consumes substantial resources, and it clusters massive data poorly. To address these problems, the original K-Medoids algorithm is improved by introducing the Canopy algorithm and the Max-Min distance algorithm, which together select K points as the initial cluster centers. To handle big-data workloads, the algorithm is parallelized with the MapReduce computing framework. The experimental results show that the improved clustering algorithm not only achieves good speedup but also improves clustering accuracy and convergence, and shows a large performance advantage on large-scale data.
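The abstract names the ingredients but gives no implementation details, so the following is only an illustrative sketch, not the authors' method. It assumes Euclidean distance, a known K (the Canopy step that the paper uses to estimate K is omitted), and a single process in which `assign` plays the role of a map step and `update` the role of a reduce step. All function names are hypothetical.

```python
import math

def max_min_seeds(points, k):
    """Pick k initial medoids by the max-min distance rule.

    Assumption: the first point is the starting seed; each further seed is
    the point whose distance to its nearest chosen seed is largest.
    """
    seeds = [points[0]]
    # min_dist[i] = distance from points[i] to its nearest seed so far
    min_dist = [math.dist(p, seeds[0]) for p in points]
    while len(seeds) < k:
        idx = max(range(len(points)), key=min_dist.__getitem__)
        seeds.append(points[idx])
        min_dist = [min(d, math.dist(p, points[idx]))
                    for d, p in zip(min_dist, points)]
    return seeds

def assign(points, medoids):
    """Map-like step: group every point under its nearest medoid's index."""
    clusters = {i: [] for i in range(len(medoids))}
    for p in points:
        nearest = min(range(len(medoids)),
                      key=lambda j: math.dist(p, medoids[j]))
        clusters[nearest].append(p)
    return clusters

def update(cluster, old_medoid):
    """Reduce-like step: the new medoid minimizes total distance in the cluster."""
    if not cluster:  # keep the old medoid if no point was assigned to it
        return old_medoid
    return min(cluster, key=lambda c: sum(math.dist(c, q) for q in cluster))

def k_medoids(points, k, max_iter=100):
    """Iterate assign/update from max-min seeds until the medoids stop changing."""
    medoids = max_min_seeds(points, k)
    for _ in range(max_iter):
        clusters = assign(points, medoids)
        new_medoids = [update(clusters[i], medoids[i]) for i in range(k)]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids

if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
    print(k_medoids(pts, 3))
```

In an actual MapReduce job the assignment would run in mappers over data splits and the medoid update in reducers keyed by cluster index; how the paper arranges those jobs is not stated in this record.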
Pages: 18-23
Page count: 6