Parallel Implementation of a Density-Based Stream Clustering Algorithm Over a GPU Scheduling System

Times cited: 1
Authors
Hassani, Marwan [1]
Tarakji, Ayman [2]
Georgiev, Lyubomir [1, 2]
Seidl, Thomas [1]
Affiliations
[1] Rhein Westfal TH Aachen, Data Management & Data Explorat Grp, Aachen, Germany
[2] Rhein Westfal TH Aachen, Chair Operating Syst, Aachen, Germany
Source
TRENDS AND APPLICATIONS IN KNOWLEDGE DISCOVERY AND DATA MINING | 2014 / Vol. 8643
Keywords
GPGPU; Stream clustering over GPU; Parallel stream mining; Density-based stream clustering; G-DenStream;
DOI
10.1007/978-3-319-13186-3_40
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graphics Processing Units (GPUs) are used together with the CPU to accelerate a wide range of general-purpose applications and scientific computations. The highly parallel architecture of the GPU consists of hundreds of cores optimized for parallel performance. Applications that benefit from the GPU architecture have to be implemented according to the GPU's parallel programming model. An algorithm that follows a sequential workflow has to be redesigned to achieve good performance on the GPU. DenStream is a recent stream clustering algorithm that consists of two main parts: the online part summarizes data from the data stream and builds micro-clusters, while the offline part generates the final clustering using density-based clustering. In this work, we present an efficient GPU-based implementation of DenStream called G-DenStream. G-DenStream is faster than DenStream, especially as the dimensionality of the streaming dataset increases, while preserving the quality of the resulting clustering. The implementations in this work parallelize both the online and offline parts, and we test their performance and GPU utilization.
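The online part described in the abstract can be illustrated with a minimal CPU-side sketch. This is not the G-DenStream implementation: the names `MicroCluster`, `online_step`, and `eps` are hypothetical, the time-decay weighting and the distinction between potential and outlier micro-clusters are omitted, and the radius is a simplified standard-deviation measure. It only shows the shape of the step where an incoming point is compared against every micro-cluster, which is the kind of independent per-cluster work that a GPU version might assign to separate threads.

```python
import math


class MicroCluster:
    """Simplified micro-cluster summary (hypothetical): weight plus
    per-dimension linear and squared sums of the absorbed points."""

    def __init__(self, point):
        self.w = 1.0
        self.ls = list(point)
        self.ss = [x * x for x in point]

    def center(self):
        return [s / self.w for s in self.ls]

    def radius_if_added(self, point):
        # Radius the cluster would have after absorbing `point`
        # (root of the summed per-dimension variances; a simplification).
        w = self.w + 1.0
        var = 0.0
        for s1, s2, x in zip(self.ls, self.ss, point):
            mean = (s1 + x) / w
            var += (s2 + x * x) / w - mean * mean
        return math.sqrt(max(var, 0.0))

    def add(self, point):
        self.w += 1.0
        for d, x in enumerate(point):
            self.ls[d] += x
            self.ss[d] += x * x


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def online_step(clusters, point, eps):
    """Absorb `point` into the nearest micro-cluster if its radius stays
    within `eps`; otherwise open a new micro-cluster. Returns the index
    of the micro-cluster that received the point."""
    if clusters:
        # Each distance is independent of the others. Assumption: a GPU
        # version could compute these in parallel, one per micro-cluster.
        dists = [squared_distance(mc.center(), point) for mc in clusters]
        nearest = min(range(len(clusters)), key=dists.__getitem__)
        if clusters[nearest].radius_if_added(point) <= eps:
            clusters[nearest].add(point)
            return nearest
    clusters.append(MicroCluster(point))
    return len(clusters) - 1
```

Calling `online_step` repeatedly on a stream of points grows the list of micro-clusters; an offline density-based pass would then group those summaries into final clusters, which this sketch does not attempt.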
Pages: 441-453
Page count: 13