An efficient parallel clustering algorithm for large scale database

Cited by: 0
Authors
School of Electronic Information, Wuhan University, Wuhan, Hubei, China [1]
Unknown [2]
Unknown [3]
Affiliations
[1] School of Electronic Information, Wuhan University, Wuhan, Hubei
[2] Hubei Bureau of Surveying and Mapping, Wuhan, Hubei
[3] PRC Education, Intel China Ltd., Shanghai
Source
J. Softw., 2009, 4 (10): 1119-1126
Keywords
Clustering; Parallel pattern; Parallel processing; Performance analysis; SLPP; SLPPCA;
DOI
10.4304/jsw.4.10.1119-1126
Abstract
In this paper, we propose a new parallel clustering algorithm, the Stem-Leaf-Point Plot Clustering Algorithm (SLPPCA). SLPPCA tends to produce clusters of different shapes and sizes and, according to our experiments, produces clusters more efficiently than traditional methods. SLPPCA fully exploits the data parallelism of data objects and adopts a task decomposition design step to balance the workload across the cores of a multi-core processor and achieve a high speedup. We implemented SLPPCA for a large-scale database on dual-core and quad-core computers and analyzed its performance on each. The experimental results show that it produced particularly good clusters for data of differing density or shape. Furthermore, with the parallel pattern used in SLPPCA on a multi-core platform, the speedup was almost linear in the number of processor cores and the number of data points. Moreover, SLPPCA determines a satisfactory number of clusters automatically during clustering. © 2009 Academy Publisher.
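The data-parallel, load-balanced pattern the abstract describes can be illustrated with a minimal Python sketch. This is not the authors' SLPPCA implementation: the abstract does not give the algorithm's details, so the binning rule (`stem_leaf_cell`, with its `leaf_unit` parameter), the chunking helper (`split_evenly`), and the merge step are all hypothetical names and assumptions chosen only to show equal-size task decomposition followed by a merge of partial results. A thread pool is used for portability; a process pool would be the usual choice for CPU-bound work in CPython.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, n_parts):
    """Split items into n_parts contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def stem_leaf_cell(point, leaf_unit=10):
    """Hypothetical binning rule: keep only the 'stem' of each coordinate.

    The stem is the coordinate floor-divided by leaf_unit, so points that
    share all stems fall into the same cell.
    """
    return tuple(int(c // leaf_unit) for c in point)


def count_cells(chunk):
    """Per-worker task: count how many points of one chunk fall in each cell."""
    return Counter(stem_leaf_cell(p) for p in chunk)


def parallel_cell_counts(points, n_workers=4):
    """Decompose the points into balanced chunks, count in parallel, merge."""
    chunks = split_evenly(points, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_cells, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total
```

Because every chunk holds nearly the same number of points and the per-point work is uniform, each worker receives a similar load, which is the precondition for the near-linear speedup the abstract reports.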
Pages: 1119-1126
Number of pages: 7