Mining frequent itemsets in data streams within a time horizon

Cited by: 19
Authors
Troiano, Luigi [1]
Scibelli, Giacomo [1]
Affiliations
[1] Univ Sannio, Dept Engn, I-82100 Benevento, Italy
Keywords
Data mining; Mining methods and algorithms; Frequent itemsets; Algorithm
DOI
10.1016/j.datak.2013.10.002
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present an algorithm for mining frequent itemsets in a stream of transactions within a limited time horizon. In contrast to other approaches presented in the literature, the proposed algorithm uses a test window that can discard non-frequent itemsets from a set of candidates. The efficiency of this approach relies on the property that the higher the support threshold, the smaller the test window. In addition to a sharp horizon, we consider a smooth window: in many applications of practical interest, not all time slots have the same relevance, e.g., more recent slots can be more interesting than older ones. Smoothness can be specified in both qualitative and quantitative terms. A comparison with other algorithms is conducted. The experimental results show that the proposed solution is faster than other approaches but has a slightly higher memory cost. (C) 2014 Elsevier B.V. All rights reserved.
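The difference between a sharp horizon and a smooth window can be illustrated with a small sketch. This is not the algorithm from the paper: it omits the test window entirely and simply recounts every itemset over the slots in the horizon. The function name `frequent_itemsets`, the per-slot `weights` parameter, and the `max_size` cap are assumptions introduced here for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(slots, min_support, weights=None, max_size=2):
    """Naive weighted support count over a time horizon (illustrative only).

    slots: list of time slots, oldest first; each slot is a list of
           transactions, and each transaction is a set of items.
    weights: one relevance weight per slot. All weights equal to 1.0
             corresponds to a sharp horizon; decreasing weights for older
             slots is one possible reading of a smooth window (assumption).
    """
    if weights is None:
        weights = [1.0] * len(slots)

    total = 0.0                  # weighted number of transactions
    counts = defaultdict(float)  # weighted occurrences per itemset

    for slot, w in zip(slots, weights):
        for tx in slot:
            total += w
            items = sorted(tx)
            for k in range(1, max_size + 1):
                for combo in combinations(items, k):
                    counts[combo] += w

    if total == 0:
        return {}

    # Keep itemsets whose weighted relative support reaches the threshold.
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


slots = [
    [{"a", "b"}, {"a"}],       # older slot
    [{"a", "b"}, {"b", "c"}],  # more recent slot
]
sharp = frequent_itemsets(slots, min_support=0.6)
smooth = frequent_itemsets(slots, min_support=0.6, weights=[0.2, 1.0])
print(sorted(sharp), sorted(smooth))
```

With the older slot down-weighted, item `a` (which appears mostly in the older slot) falls below the threshold, while `b` stays frequent. A full recount like this is what an incremental stream algorithm is designed to avoid.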
Pages: 21-37
Page count: 17