Mining high utility itemsets based on the time decaying model

被引:30
作者
Kim, Donggyu [1 ]
Yun, Unil [1 ]
机构
[1] Sejong Univ, Dept Comp Engn, Seoul, South Korea
基金
新加坡国家研究基金会;
关键词
Data mining; high utility itemset mining; stream data mining; time decaying model; TRANSACTIONAL DATA STREAMS; FREQUENT ITEMSETS; SLIDING-WINDOW; EFFICIENT ALGORITHMS; PATTERNS;
D O I
10.3233/IDA-160861
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Various association rule mining techniques, such as frequent itemset mining, sequence itemset mining, and high utility itemset mining, have been studied to reveal valuable knowledge hidden from large databases. Among these techniques, high utility itemset mining has been researched actively by many researchers because of its characteristics that can find more meaningful itemsets compared to those of other approaches by considering the utility of each item in a given database. In recent years, mining high utility itemsets over data streams has emerged as an interesting topic because many users want to obtain valuable information from stream data, which are continually generated at rapid rates. However, in these environments, most of the previous high utility itemset mining methods cannot efficiently work in terms of both runtime and memory usage. In addition, since they conduct their mining processes without any consideration of transactions' arrival-time, it is hard for these methods to sufficiently fulfill the needs of users when they want to obtain only up to date, relevant information over data streams. In this paper, we propose a new tree-based algorithm that mines recent high utility itemsets over data streams. On the basis of the time decaying model, our algorithm diminishes the utilities of transactions according to their arrival-time in order to assign larger weights to recent data compared to those of older ones. Moreover, the algorithm regularly updates the utility information in its tree data structure and prunes the nodes with the utility values less than a user-specified minimum value. Thereby, the algorithm can maintain a reasonable memory usage bound by avoiding memory use that is unessential. Experimental results demonstrate that our algorithm can mine recent high utility itemsets from varying stream data while consuming smaller computational resources than those of the existing algorithms.
引用
收藏
页码:1157 / 1180
页数:24
相关论文
共 34 条
[1]  
Agrawal R., P 20 INT C VERY LARG
[2]   Interactive mining of high utility patterns over data streams [J].
Ahmed, Chowdhury Farhan ;
Tanbeer, Syed Khairuzzaman ;
Jeong, Byeong-Soo ;
Choi, Ho-Jin .
EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (15) :11979-11991
[3]   Efficient Tree Structures for High Utility Pattern Mining in Incremental Databases [J].
Ahmed, Chowdhury Farhan ;
Tanbeer, Syed Khairuzzaman ;
Jeong, Byeong-Soo ;
Lee, Young-Koo .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (12) :1708-1721
[4]   Scalable out-of-core itemset mining [J].
Baralis, Elena ;
Cerquitelli, Tania ;
Chiusano, Silvia ;
Grand, Alberto .
INFORMATION SCIENCES, 2015, 293 :146-162
[5]   Finding recently frequent itemsets adaptively over online transactional data streams [J].
Chang, Joong Hyuk ;
Lee, Won Suk .
INFORMATION SYSTEMS, 2006, 31 (08) :849-869
[6]   Mining frequent patterns in a varying-size sliding window of online transactional data streams [J].
Chen, Hui ;
Shu, Lihchyun ;
Xia, Jiali ;
Deng, Qingshan .
INFORMATION SCIENCES, 2012, 215 :15-36
[7]   Mining frequent items in data stream using time fading model [J].
Chen, Ling ;
Mei, Qingling .
INFORMATION SCIENCES, 2014, 257 :54-69
[8]   An efficient algorithm for mining temporal high utility itemsets from data streams [J].
Chu, Chun-Jung ;
Tseng, Vincent S. ;
Liang, Tyne .
JOURNAL OF SYSTEMS AND SOFTWARE, 2008, 81 (07) :1105-1117
[9]  
Feigenbaum J., 1999, SIAM J COMPUT, V32, P131
[10]   UT-Tree: Efficient mining of high utility itemsets from data streams [J].
Feng, Lin ;
Wang, Le ;
Jin, Bo .
INTELLIGENT DATA ANALYSIS, 2013, 17 (04) :585-602