Mining top-k frequent closed itemsets over data streams using the sliding window model

Cited by: 20
Authors
Tsai, Pauray S. M. [1]
Affiliations
[1] Minghsin Univ Sci & Technol, Dept Comp Sci & Informat Engn, Hsinchu, Taiwan
Keywords
Data mining; Data stream; Association rule; Frequent closed itemset; Sliding window
DOI
10.1016/j.eswa.2010.03.023
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Association rule mining is an important research topic in the data mining community. There are two difficulties occurring in mining association rules. First, the user must specify a minimum support for mining. Typically it may require tuning the value of the minimum support many times before a set of useful association rules could be obtained. However, it is not easy for the user to find an appropriate minimum support. Secondly, there are usually a lot of frequent itemsets generated in the mining result. It will result in the generation of a large number of association rules, giving rise to difficulties of applications. In this paper, we consider mining top-k frequent closed itemsets from data streams using a sliding window technique. A single pass algorithm, called FCI_max, is developed for the generation of top-k frequent closed itemsets of length no more than max_l. Our method can efficiently resolve the mentioned two difficulties in association rule mining, which promotes the usability of the mining result in practice. © 2010 Elsevier Ltd. All rights reserved.
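To make the task in the abstract concrete, the following is a minimal brute-force sketch of what "top-k frequent closed itemsets of length no more than max_l over a sliding window" means. It is not the paper's single-pass algorithm, whose details this record does not give. The function names `top_k_closed_itemsets` and `mine_stream`, the `window_size` parameter, and the tie-breaking rule are assumptions introduced here for illustration only.

```python
from collections import deque
from itertools import combinations


def top_k_closed_itemsets(window, k, max_l):
    """Brute-force top-k closed itemsets of length <= max_l in one window.

    Illustrative only: this re-enumerates every candidate itemset and is
    NOT the incremental single-pass method described in the abstract.
    """
    transactions = [frozenset(t) for t in window]

    # Count the support of every itemset of length 1..max_l.
    supports = {}
    for t in transactions:
        for size in range(1, min(max_l, len(t)) + 1):
            for combo in combinations(sorted(t), size):
                key = frozenset(combo)
                supports[key] = supports.get(key, 0) + 1

    # An itemset is closed when it equals the intersection of all
    # transactions that contain it (no superset has the same support).
    closed = []
    for itemset, support in supports.items():
        covering = [t for t in transactions if itemset <= t]
        if frozenset.intersection(*covering) == itemset:
            closed.append((itemset, support))

    # Rank by support; ties are broken alphabetically (an assumption).
    closed.sort(key=lambda pair: (-pair[1], sorted(pair[0])))
    return closed[:k]


def mine_stream(stream, window_size, k, max_l):
    """Yield the top-k result each time the sliding window is full."""
    window = deque(maxlen=window_size)  # oldest transaction drops out
    for transaction in stream:
        window.append(transaction)
        if len(window) == window_size:
            yield top_k_closed_itemsets(window, k, max_l)


if __name__ == "__main__":
    stream = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
    for result in mine_stream(stream, window_size=3, k=2, max_l=2):
        print([(sorted(items), support) for items, support in result])
```

Note that the user supplies k and max_l rather than a minimum support threshold, which is the usability point the abstract makes. Closedness is checked here against all supersets, including those longer than max_l; whether the paper makes the same choice is not stated in this record.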
Pages: 6968-6973
Page count: 6
Related papers
20 in total
[1]  
Agrawal R., 1994, P 20 INT C VER LARG, P487, DOI 10.5555/645920.672836
[2]   An Efficient Algorithm for Mining Closed Frequent Itemsets in Data Streams [J].
Ao, Fujiang ;
Du, Jing ;
Yan, Yuejin ;
Liu, Baohong ;
Huang, Kedi .
8TH IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY WORKSHOPS: CIT WORKSHOPS 2008, PROCEEDINGS, 2008, :37-+
[3]  
GABER MM, 2005, SIGMOD, P18
[4]  
Chang J.H., 2003, P 9 ACM SIGKDD INT C, P487, DOI 10.1145/956750.956807
[5]  
Chi Y., 2004, P IEEE INT C DAT MIN
[6]   Catch the moment: maintaining closed frequent itemsets over a data stream sliding window [J].
Chi, Yun ;
Wang, Haixun ;
Yu, Philip S. ;
Muntz, Richard R. .
KNOWLEDGE AND INFORMATION SYSTEMS, 2006, 10 (03) :265-294
[7]  
Giannella C., 2003, Next Gener. Data Min, V212, P191
[8]  
GOUDA K, 2001, P IEEE INT C DAT ENG
[9]  
HAN J, 2000, P 2000 ACM SIGMOD IN, P1, DOI 10.1145/342009.335372
[10]   Research issues in data stream association rule mining [J].
Jiang, N ;
Gruenwald, L .
SIGMOD RECORD, 2006, 35 (01) :14-19