An OpenCL Candidate Slicing Frequent Pattern Mining Algorithm on Graphic Processing Units

被引:0
作者
Lin, Che-Yu [2 ]
Yu, Kun-Ming [2 ]
Ouyang, Wen [2 ]
Zhou, Jiayi [1 ]
机构
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 300, Taiwan
[2] Chung Hua Univ, Dept Comp Sci & Informat Engn, Hsinchu 300, Taiwan
来源
2011 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2011年
关键词
frequent pattern mining; parallel processing; graphic processing unit (GPU); OpenCL;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Frequent pattern mining (FPM) is important in data mining field with Apriori algorithm to be one of the commonly used approaches to solve it. However, Aprori algorithm encounters an issue that the computation time increases dramatically when data size increases and when the threshold is small. Many parallel algorithms have been proposed to speed up the computation using computer clusters or grid systems. GPUs have also been applied on FPM with only few adopting OpenCL although OpenCL has the advantage of being platform independent. Thus, the aim of this research is to develop efficient parallel Aprori strategy using GPU and OpenCL. Our novel method, Candidate Slicing Frequent Pattern Mining (CSFPM) algorithm, improves over the previous method by slicing candidate information to better balance the load between processing units. This strategy is proved to be more efficient according to our experiments. For example, CSFPM is at most 2.6 times faster than the previous method. Therefore, CSFPM is an efficient parallel Apriori algorithm which can reduce computation time and improve overall performance.
引用
收藏
页码:2344 / 2349
页数:6
相关论文
共 16 条
[1]   Implementation of Association Rule Mining using CUDA [J].
Adil, Syed Hasan ;
Qamar, Sadaf .
ICET: 2009 INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES, PROCEEDINGS, 2009, :332-+
[2]  
Agrawal R., 2009, Quest synthetic data generator. ibm almaden research center, san jose
[3]  
Agrawal R., P 20 INT C VERY LARG
[4]  
[Anonymous], OPENCL
[5]  
ATI, STREAM SDK
[6]  
Bohm Christian, 2009, T LARGE SCALE DATA K
[7]  
Brin S., 1997, SIGMOD Record, V26, P255, DOI [10.1145/253262.253327, 10.1145/253262.253325]
[8]  
Fang Wenbin, 2008, PARALLEL DATA MINING
[9]   Mining frequent patterns without candidate generation: A frequent-pattern tree approach [J].
Han, JW ;
Pei, J ;
Yin, YW ;
Mao, RY .
DATA MINING AND KNOWLEDGE DISCOVERY, 2004, 8 (01) :53-87
[10]   Frequent pattern mining on message passing multiprocessor systems [J].
Javed, A ;
Khokhar, A .
DISTRIBUTED AND PARALLEL DATABASES, 2004, 16 (03) :321-334