A general-purpose distributed pattern mining system

Cited by: 20
Authors
Belhadi, Asma [1]
Djenouri, Youcef [2, 3]
Lin, Jerry Chun-Wei [4]
Cano, Alberto [5]
Affiliations
[1] USTHB, Dept Comp Sci, Algiers, Algeria
[2] NTNU, Dept Comp Sci, Trondheim, Norway
[3] SINTEF Digital, Forskningsveien 1, N-0314 Oslo, Norway
[4] Western Norway Univ Appl Sci HVL, Dept Comp Math & Phys, Bergen, Norway
[5] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA USA
Keywords
Pattern mining; Decomposition; Distributed computing; Heterogeneous architecture; Frequent itemsets; Parallel; Algorithm; Discovery
DOI
10.1007/s10489-020-01664-w
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper explores five pattern mining problems and proposes a new distributed framework called DT-DPM (Decomposition Transaction for Distributed Pattern Mining). DT-DPM addresses the limitations of existing pattern mining approaches by reducing the enumeration search space: it derives the relevant patterns by studying the correlations among transactions. It first decomposes the set of transactions into several clusters of different sizes, and then exploits heterogeneous architectures, including MapReduce, single CPU, and multi-CPU, according to the density of each subset of transactions. To evaluate the framework, extensive experiments were carried out on five pattern mining problems: Frequent Itemset Mining (FIM), Weighted Itemset Mining (WIM), Uncertain Itemset Mining (UIM), High Utility Itemset Mining (HUIM), and Sequential Pattern Mining (SPM). The results show that DT-DPM improves the scalability of pattern mining algorithms on large databases and outperforms the baseline parallel pattern mining algorithms on big databases.
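The decompose-then-mine idea in the abstract can be illustrated with a toy sketch. This is not the DT-DPM implementation: the abstract does not say how clusters are formed or how density maps to an architecture, so everything below is an assumption made for illustration. The hypothetical `decompose` groups transactions that share at least one item (connected components), `density` is taken to be the fraction of filled cells in a cluster's transaction-by-item matrix, and `frequent_itemsets` is a brute-force miner suitable only for tiny inputs.

```python
from collections import Counter
from itertools import combinations


def decompose(transactions):
    """Group transactions sharing at least one item (assumed clustering rule)."""
    clusters = []  # list of (items_in_cluster, transactions_in_cluster)
    for t in transactions:
        items = set(t)
        merged_items, merged_tx = set(items), [t]
        untouched = []
        for c_items, c_tx in clusters:
            if c_items & items:
                # The new transaction links this cluster to the merged one.
                merged_items |= c_items
                merged_tx = c_tx + merged_tx
            else:
                untouched.append((c_items, c_tx))
        untouched.append((merged_items, merged_tx))
        clusters = untouched
    return [tx for _, tx in clusters]


def density(cluster):
    """Fraction of filled cells in the cluster's transaction-by-item matrix."""
    items = set().union(*map(set, cluster))
    filled = sum(len(set(t)) for t in cluster)
    return filled / (len(cluster) * len(items))


def frequent_itemsets(cluster, min_count):
    """Brute-force frequent itemset mining (exponential; toy inputs only)."""
    counts = Counter()
    for t in cluster:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


def mine_by_cluster(transactions, min_count):
    """Mine each cluster independently and merge the results."""
    result = {}
    for cluster in decompose(transactions):
        # A real system would pick an execution backend here, e.g. from
        # density(cluster); the abstract does not give that rule.
        result.update(frequent_itemsets(cluster, min_count))
    return result


if __name__ == "__main__":
    db = [["a", "b"], ["b", "c"], ["x", "y"], ["y"], ["a", "c"]]
    for cluster in decompose(db):
        print(len(cluster), round(density(cluster), 2))
    print(mine_by_cluster(db, min_count=2))
```

Because the clusters in this sketch share no items, every itemset occurs in exactly one cluster, so mining clusters separately with an absolute support count gives the same result as mining the whole database. A clustering that lets items overlap between clusters would need an extra merge step that this sketch does not cover.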
Pages: 2647-2662
Page count: 16