Mining frequent closed itemsets from a landmark window over online data streams

被引:26
作者
Liu, Xuejun [1 ,3 ]
Guan, Jihong [2 ]
Hu, Ping [3 ]
机构
[1] Fudan Univ, Dept Comp Sci & Engn, Shanghai 200433, Peoples R China
[2] Tongji Univ, Dept Comp Sci & Technol, Shanghai 200092, Peoples R China
[3] Nanjing Univ Technol, Coll Informat Sci & Engn, Nanjing 210009, Peoples R China
基金
中国国家自然科学基金;
关键词
Data streams; Frequent closed items; Landmark window; Data mining;
D O I
10.1016/j.camwa.2008.10.060
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
The frequent closed itemsets determine exactly the complete set of frequent itemsets and are usually much smaller than the later. However, mining frequent closed itemsets from a landmark window over data streams is a challenging problem. To solve the problem, this paper presents a novel algorithm (called FP-CDS) that can capture all frequent closed itemsets and a new storage structure (called FP-CDS tree) that can be dynamically adjusted to reflect the evolution of itemsets' frequencies over time. A landmark window is divided into several basic windows and these basic windows are used as updating units. Potential frequent closed itemsets in each basic window are mined and stored in FP-CDS tree based on some proposed strategies. Extensive experiments are conducted to validate the proposed method. (C) 2008 Elsevier Ltd. All rights reserved.
引用
收藏
页码:927 / 936
页数:10
相关论文
共 20 条
[1]  
ASAI T, 2002, IEEE INT C DAT MIN I
[2]  
Chang J. H., 2003, 9 ACM SIGKDD INT C K
[3]  
Charikar M., 2002, P 29 INT C AUT LANG
[4]  
CHI Y, 2004, INT C DAT MIN NOV
[5]  
Cormode G, 2003, ACM S PRINC DAT SYST
[6]  
CORMODE G, 2003, INT C VER LARG DAT B
[7]  
Giannella C., 2003, NEXT GENERATION DATA
[8]  
JIANG N, 2006, ACM INT C KNOWL DAT
[9]  
JIN C, 2003, C INF KNOWL MAN CIKM
[10]  
JIN R, 2005, P 5 IEEE INT C DAT M