Incrementally fast updated frequent pattern trees

被引:135
作者
Hong, Tzung-Pei [1 ]
Lin, Chun-Wei [2 ]
Wu, Yu-Lung [2 ]
机构
[1] Natl Univ Kaohsiung, Dept Elect Engn, Kaohsiung 811, Taiwan
[2] I Shou Univ, Dept Informat Management, Kaohsiung 84008, Taiwan
关键词
data mining; FP-tree; FUFP-tree; incremental mining; maintenance;
D O I
10.1016/j.eswa.2007.04.009
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The frequent-pattern-tree (FP-tree) is an efficient data structure for association-rule mining without generation of candidate itemsets. It was used to compress a database into a tree structure which stored only large items. It, however, needed to process all transactions in a batch way. In real-world applications, new transactions are usually inserted into databases. In this paper, we thus attempt to modify the FP-tree construction algorithm for efficiently handling new transactions. A fast updated FP-tree (FUFP-tree) structure is proposed, which makes the tree update process become easier. An incremental FUFP-tree maintenance algorithm is also proposed for reducing the execution time in reconstructing the tree when new transactions are inserted. Experimental results also show that the proposed FUFP-tree maintenance algorithm runs faster than the batch FP-tree construction algorithm for handling new transactions and generates nearly the same tree structure as the FP-tree algorithm. The proposed approach can thus achieve a good trade-off between execution time and tree complexity. (c) 2007 Elsevier Ltd. All rights reserved.
引用
收藏
页码:2424 / 2435
页数:12
相关论文
共 22 条
[1]  
Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
[2]   DATABASE MINING - A PERFORMANCE PERSPECTIVE [J].
AGRAWAL, R ;
IMIELINSKI, T ;
SWAMI, A .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1993, 5 (06) :914-925
[3]  
AGRAWAL R, 1995, PROC INT CONF DATA, P3, DOI 10.1109/ICDE.1995.380415
[4]  
AGRAWAL R, 1997, 3 INT C KNOWL DISC D, P67
[5]  
Agrawal R., 1994, Proceedings of the 20th International Conference on Very Large Data Bases. VLDB'94, P487
[6]  
[Anonymous], 21 INT C VLDB
[7]   Data mining: An overview from a database perspective [J].
Chen, MS ;
Han, JW ;
Yu, PS .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1996, 8 (06) :866-883
[8]   Maintenance of discovered association rules in large databases: Art incremental updating technique [J].
Cheung, DW ;
Han, JW ;
Ng, VT ;
Wong, CY .
PROCEEDINGS OF THE TWELFTH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, 1996, :106-114
[9]  
CHEUNG DW, 1997, P 5 INT C DAT SYST A, P185
[10]  
EZEIFE CI, 2002, C CAN SOC COMP STUD, P147