An efficient parallel clustering algorithm for large scale database

Cited by: 0
Authors
School of Electronic Information, Wuhan University, Wuhan, Hubei, China [1]
Unknown [2]
Unknown [3]
Affiliations
[1] School of Electronic Information, Wuhan University, Wuhan, Hubei
[2] Hubei Bureau of Surveying and Mapping, Wuhan, Hubei
[3] PRC Education, Intel China Ltd., Shanghai
Source
J. Softw. | 2009 / Vol. 4, No. 10 / pp. 1119-1126
Keywords
Clustering; Parallel pattern; Parallel processing; Performance analysis; SLPP; SLPPCA;
DOI
10.4304/jsw.4.10.1119-1126
Abstract
In this paper, we propose a new parallel clustering algorithm, the Stem-Leaf-Point Plot Clustering Algorithm (SLPPCA). SLPPCA tends to produce clusters of different shapes and sizes and, according to our experiments, produces clusters more efficiently than traditional methods. SLPPCA fully exploits the data parallelism of data objects and adopts a task decomposition design step to balance the workloads of multi-core processors and achieve a high speedup. We implemented SLPPCA for a large-scale database on dual-core and quad-core computers and analyzed its performance on each. The experimental results show that the clusters it produced were particularly good for data of differing densities and shapes. Furthermore, with the parallel pattern used in SLPPCA on a multi-core platform, the speedup was almost linear in the number of processor cores and the number of data points. Moreover, SLPPCA automatically generates a satisfactory number of clusters during the clustering process. © 2009 Academy Publisher.
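The abstract mentions task decomposition for load balancing and a near-linear speedup, but does not give the SLPPCA procedure itself. The following is only a generic sketch of those two ideas, not the authors' method: the function names (`split_into_tasks`, `process_chunks`, `speedup`, `efficiency`) and the `tasks_per_worker` parameter are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(n_points, n_workers, tasks_per_worker=4):
    """Split indices 0..n_points into contiguous (start, stop) ranges.

    Creating more tasks than workers (hypothetical knob: tasks_per_worker)
    lets a pool hand out new work to whichever worker finishes first.
    """
    n_tasks = max(1, min(n_points, n_workers * tasks_per_worker))
    base, extra = divmod(n_points, n_tasks)
    ranges, start = [], 0
    for i in range(n_tasks):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def process_chunks(points, chunk_fn, n_workers, tasks_per_worker=4):
    """Apply chunk_fn to each chunk of points and return results in order.

    A thread pool keeps the sketch simple; for CPU-bound pure-Python work
    a process pool would be the usual substitute.
    """
    ranges = split_into_tasks(len(points), n_workers, tasks_per_worker)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lambda r: chunk_fn(points[r[0]:r[1]]), ranges))


def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_cores):
    """Parallel efficiency E = S / p; E close to 1 means near-linear speedup."""
    return speedup(t_serial, t_parallel) / n_cores
```

With measured run times, `efficiency(t_serial, t_parallel, n_cores)` near 1.0 on both the dual-core and quad-core machines would correspond to the "almost linear" speedup the abstract reports.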
Pages: 1119-1126
Number of pages: 8
Related papers
50 in total
  • [1] An efficient clustering algorithm for partitioning parallel programs
    Maheshwari, P
    Shen, H
    PARALLEL COMPUTING, 1998, 24 (5-6) : 893 - 909
  • [2] An Efficient Parallel Algorithm for Large Scale Hydrothermal System Operation Planning
    Pinto, Roberto J.
    Borges, Carmen L. T.
    Maceira, Maria E. P.
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2013, 28 (04) : 4888 - 4896
  • [3] An Energy-Efficient Clustering Algorithm for Large Scale Wireless Sensor Networks
    Soleimani, Maryam
    Sharifian, Amirali
    Fanian, Ali
    2013 21ST IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2013,
  • [4] A distributed approximate nearest neighbors algorithm for efficient large scale mean shift clustering
    Beck, Gael
    Duong, Tarn
    Lebbah, Mustapha
    Azzag, Hanane
    Cerin, Christophe
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2019, 134 : 128 - 139
  • [5] An efficient parallel direction-based clustering algorithm
    Zhong, Kai
    Zhou, Xu
    Zhou, Liqian
    Yang, Zhibang
    Liu, Chubo
    Xiao, Na
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2020, 145 : 24 - 33
  • [6] Efficient Large Scale Clustering based on Data Partitioning
    Bendechache, Malika
    Le-Khac, Nhien-An
    Kechadi, M-Tahar
    PROCEEDINGS OF 3RD IEEE/ACM INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS, (DSAA 2016), 2016, : 612 - 621
  • [7] A Parallel Local Search Algorithm for Clustering Large Biological Networks
    Coccimiglio G.
    Choudhury S.
    1600, World Scientific (27): : 3 - 4
  • [8] A parallel fuzzy clustering algorithm for large graphs using Pregel
    Bhatia, Vandana
    Rani, Rinkle
    EXPERT SYSTEMS WITH APPLICATIONS, 2017, 78 : 135 - 144
  • [9] PDBSCAN: Parallel DBSCAN for Large-Scale Clustering Applications
    谢永红
    马延辉
    周芳
    刘颖安
    Journal of Donghua University(English Edition), 2012, 29 (01) : 76 - 79
  • [10] Parallel Clustering Algorithm for Large Data Sets with Applications in Bioinformatics
    Olman, Victor
    Mao, Fenglou
    Wu, Hongwei
    Xu, Ying
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2009, 6 (02) : 344 - 352