An efficient pattern growth approach for mining fault tolerant frequent itemsets

被引:15
作者
Bashir, Shariq [1 ]
机构
[1] Fdn Univ, Dept Software Engn, Islamabad, Pakistan
关键词
Fault tolerant frequent itemset mining; Frequent itemset mining; Pattern growth; Association rules mining; APPROXIMATE;
D O I
10.1016/j.eswa.2019.113046
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Mining fault tolerant (FT) frequent itemsets from transactional databases are computationally more expensive than mining exact matching frequent itemsets. Previous algorithms mine FT frequent itemsets using Apriori heuristic. Apriori-like algorithms generate exponential number of candidate itemsets including the itemsets that do not exist in the database. These algorithms require multiple scans of database for counting the support of candidate FT itemsets. In this paper we present a novel algorithm, which mines FT frequent itemsets using frequent pattern growth approach (FT-PatternGrowth). FT-PatternGrowth adopts a divide-and-conquer technique and recursively projects transactional database into a set of smaller projected transactional databases and mines FT frequent itemsets in each projected database by exploring only locally frequent items. This mines the complete set of FT frequent itemsets and substantially reduces those candidate itemsets that do not exist in the database. FT-PatternGrowth stores the transactional database in a highly condensed much smaller data structure called frequent pattern tree (FP-tree). The support of candidate itemsets are counted directly from the FP-tree without scanning the original database multiple times. This improves the processing speed of algorithm. Our experiments on benchmark databases indicates mining FT frequent itemsets using FT-PatternGrowth is highly efficient than Apriori-like algorithms. (C) 2019 Elsevier Ltd. All rights reserved.
引用
收藏
页数:15
相关论文
共 36 条
[1]  
[Anonymous], ACTA POLYTECHNICA HU
[2]  
[Anonymous], 2000, SIGMOD, DOI DOI 10.1145/342009.335372
[3]   Frequent Itemsets Mining for Big Data: A Comparative Analysis [J].
Apiletti, Daniele ;
Baralis, Elena ;
Cerquitelli, Tania ;
Garza, Paolo ;
Pulvirenti, Fabio ;
Venturini, Luca .
BIG DATA RESEARCH, 2017, 9 :67-83
[4]   Mining fault tolerant frequent patterns using pattern growth approach [J].
Bashir, Shariq ;
Halim, Zahid ;
Baig, A. Rauf .
2008 IEEE/ACS INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS, VOLS 1-3, 2008, :172-179
[5]   A novel approach for mining maximal frequent patterns [J].
Bay Vo ;
Sang Pham ;
Tuong Le ;
Deng, Zhi-Hong .
EXPERT SYSTEMS WITH APPLICATIONS, 2017, 73 :178-186
[6]  
Becquet C, 2002, GENOME BIOL, V3
[7]  
Besson J, 2006, LECT NOTES COMPUT SC, V3933, P55
[8]  
Bodon F., 2003, P ICDM 2003 WORKSH F
[9]   MAFIA: A maximal frequent itemset algorithm [J].
Burdick, D ;
Calimlim, M ;
Flannick, J ;
Gehrke, J ;
Yiu, TM .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (11) :1490-1504
[10]   Mining frequent itemsets without support threshold: With and without item constraints [J].
Cheung, YL ;
Fu, AWC .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2004, 16 (09) :1052-1069