Efficient Mining of Frequent Itemsets Using Only One Dynamic Prefix Tree

被引:13
作者
Qu, Jun-Feng [1 ]
Hang, Bo [1 ]
Wu, Zhao [1 ]
Wu, Zhongbo [1 ]
Gu, Qiong [1 ]
Tang, Bo [2 ]
机构
[1] Hubei Univ Arts & Sci, Sch Comp Engn, Xiangyang 441053, Peoples R China
[2] Hubei Univ Arts & Sci, Sch Math & Stat, Xiangyang 441053, Peoples R China
关键词
Itemsets; Data mining; Heuristic algorithms; Lattices; Task analysis; Art; Frequent itemset; post-conditional database; dynamic prefix tree; ASSOCIATION RULES; PATTERNS; APPROXIMATION; GENERATION; MAPREDUCE; ALGORITHM;
D O I
10.1109/ACCESS.2020.3029302
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Frequent itemset mining is a fundamental problem in data mining area because frequent itemsets have been extensively used in reasoning, classifying, clustering, and so on. To mine frequent itemsets, previous algorithms based on a prefix tree structure have to construct many prefix trees, which is very time-consuming. In this paper, we propose a novel frequent itemset mining algorithm called DPT (Dynamic Prefix Tree) which uses only one prefix tree. We first introduce the concept of the post-conditional database of an itemset, and analyze the distribution of an itemsets post-conditional database in a prefix tree representing a database. Subsequently, we illuminate how DPT adjusts the prefix tree to mine frequent itemsets and give three optimization techniques. An interesting advantage of DPT is that the algorithm can directly output a prefix tree representing all frequent itemsets after slight modifications. Using only one dynamic prefix tree, DPT avoids the high cost of constructing many prefix trees and thus gains significant performance improvement. Experimental results show that DPT remarkably outperforms previous algorithms with respect to running time and memory usage, and that a prefix tree representing all frequent itemsets DPT outputs can be used more efficient than a list representing them previous algorithms output.
引用
收藏
页码:183722 / 183735
页数:14
相关论文
共 37 条
[1]   DRFP-tree: disk-resident frequent pattern tree [J].
Adnan, Muhaimenul ;
Alhajj, Reda .
APPLIED INTELLIGENCE, 2009, 30 (02) :84-97
[2]   A tree projection algorithm for generation of frequent item sets [J].
Agarwal, RC ;
Aggarwal, CC ;
Prasad, VVV .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2001, 61 (03) :350-371
[3]  
Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
[4]  
Agrawal R., 1994, 20 INT C VER LARG DA, P487
[5]   LUIM: New Low-Utility Itemset Mining Framework [J].
Alhusaini, Naji ;
Karmoshi, Saleem ;
Hawbani, Ammar ;
Jing, Li ;
Alhusaini, Abdullah ;
Al-Sharabi, Yaser .
IEEE ACCESS, 2019, 7 :100535-100551
[6]   High performance evaluation of evolutionary-mined association rules on GPUs [J].
Cano, Alberto ;
Maria Luna, Jose ;
Ventura, Sebastian .
JOURNAL OF SUPERCOMPUTING, 2013, 66 (03) :1438-1461
[7]   Approximation of Probabilistic Maximal Frequent Itemset Mining Over Uncertain Sensed Data [J].
Chen, Sheng ;
Nie, Lihai ;
Tao, Xiaoyi ;
Li, Zhiyang ;
Zhao, Laiping .
IEEE ACCESS, 2020, 8 :97529-97539
[8]  
Chen XY, 2006, INT CONF E BUS ENG, P466
[9]   Exploiting GPU and cluster parallelism in single scan frequent itemset mining [J].
Djenouri, Youcef ;
Djenouri, Djamel ;
Belhadi, Asma ;
Cano, Alberto .
INFORMATION SCIENCES, 2019, 496 :363-377
[10]  
El-Hajj M., 2003, P ICDM WORKSH FIMI