A general-purpose distributed pattern mining system

Cited by: 0
Authors
Asma Belhadi
Youcef Djenouri
Jerry Chun-Wei Lin
Alberto Cano
Affiliations
[1] USTHB, Department of Computer Science
[2] NTNU, Department of Computer Science
[3] SINTEF Digital, Department of Computing, Mathematics and Physics
[4] Western Norway University of Applied Sciences (HVL), Department of Computer Science
[5] Virginia Commonwealth University
Source
Applied Intelligence | 2020 / Vol. 50
Keywords
Pattern mining; Decomposition; Distributed computing; Heterogeneous architecture
DOI
Not available
Abstract
This paper explores five pattern mining problems and proposes a new distributed framework called DT-DPM (Decomposition Transaction for Distributed Pattern Mining). DT-DPM addresses the limitations of existing pattern mining approaches by reducing the enumeration search space: it derives the relevant patterns by studying the correlations among transactions. It first decomposes the set of transactions into several clusters of different sizes, and then explores heterogeneous architectures, including MapReduce, single CPU, and multi-CPU, based on the density of each subset of transactions. To evaluate DT-DPM, extensive experiments were carried out on five pattern mining problems: Frequent Itemset Mining (FIM), Weighted Itemset Mining (WIM), Uncertain Itemset Mining (UIM), High Utility Itemset Mining (HUIM), and Sequential Pattern Mining (SPM). The results show that DT-DPM improves the scalability of pattern mining algorithms on large databases and outperforms the baseline parallel pattern mining algorithms on big databases.
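The decompose-then-dispatch idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the greedy Jaccard clustering, the density definition (fraction of filled cells in a cluster's transaction-item matrix), the thresholds, and the mapping from density to backend are all assumptions made for illustration, and the names `decompose`, `density`, and `choose_backend` are hypothetical.

```python
from typing import FrozenSet, List, Set

Transaction = FrozenSet[str]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two item sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def decompose(transactions: List[Transaction],
              min_similarity: float = 0.3) -> List[List[Transaction]]:
    """Greedy single-pass clustering (an assumed stand-in for the
    paper's decomposition step): each transaction joins the first
    cluster whose accumulated item set is similar enough."""
    clusters: List[List[Transaction]] = []
    cluster_items: List[Set[str]] = []
    for t in transactions:
        for i, items in enumerate(cluster_items):
            if jaccard(set(t), items) >= min_similarity:
                clusters[i].append(t)
                cluster_items[i] = items | t
                break
        else:
            # No sufficiently similar cluster: start a new one.
            clusters.append([t])
            cluster_items.append(set(t))
    return clusters


def density(cluster: List[Transaction]) -> float:
    """Fraction of filled cells in the cluster's transaction-item matrix."""
    items = set().union(*cluster)
    if not cluster or not items:
        return 0.0
    return sum(len(t) for t in cluster) / (len(cluster) * len(items))


def choose_backend(cluster: List[Transaction],
                   low: float = 0.3, high: float = 0.8) -> str:
    """Hypothetical routing rule: denser clusters go to heavier backends.
    The thresholds and the mapping are illustrative assumptions."""
    d = density(cluster)
    if d < low:
        return "single_cpu"
    if d < high:
        return "multi_cpu"
    return "mapreduce"


if __name__ == "__main__":
    db = [frozenset("ab"), frozenset("abc"), frozenset("xy"), frozenset("x")]
    for c in decompose(db):
        print(sorted(map(sorted, c)), round(density(c), 3), choose_backend(c))
```

Each cluster can then be mined independently by whichever algorithm suits the problem at hand (FIM, WIM, UIM, HUIM, or SPM); how patterns that span clusters are handled is outside the scope of this sketch.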
Pages: 2647-2662
Page count: 15