Interactive mining of high utility patterns over data streams

Cited by: 69
Authors
Ahmed, Chowdhury Farhan [1 ]
Tanbeer, Syed Khairuzzaman [1 ]
Jeong, Byeong-Soo [1 ]
Choi, Ho-Jin [2 ]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Youngin Si 446701, Kyunggi Do, South Korea
[2] Korea Adv Inst Sci & Technol, Dept Comp Sci, Taejon 305701, South Korea
Keywords
Data mining; Knowledge discovery; High utility pattern mining; Interactive mining; Incremental mining; Data streams; ASSOCIATION RULES; SLIDING-WINDOW; EFFICIENT ALGORITHM; FREQUENT ITEMSETS; TREE; SETS;
DOI
10.1016/j.eswa.2012.03.062
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
High utility pattern (HUP) mining over data streams has become a challenging research issue in data mining. When a data stream flows through, the old information may not be interesting in the current time period. Therefore, incremental HUP mining is necessary over data streams. Even though some methods have been proposed to discover recent HUPs by using a sliding window, they suffer from the level-wise candidate generation-and-test problem. Hence, they need a large amount of execution time and memory. Moreover, their data structures are not suitable for interactive mining. To solve these problems of the existing algorithms, in this paper, we propose a novel tree structure, called HUS-tree (high utility stream tree) and a new algorithm, called HUPMS (high utility pattern mining over stream data) for incremental and interactive HUP mining over data streams with a sliding window. By capturing the important information of stream data into an HUS-tree, our HUPMS algorithm can mine all the HUPs in the current window with a pattern growth approach. Furthermore, HUS-tree is very efficient for interactive mining. Extensive performance analyses show that our algorithm is very efficient for incremental and interactive HUP mining over data streams and significantly outperforms the existing sliding window-based HUP mining algorithms. (C) 2012 Elsevier Ltd. All rights reserved.
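To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of high utility pattern mining over a sliding window of batches. It is not the HUS-tree or the HUPMS algorithm, whose details are not given in this record; the profit table, the sample transactions, the threshold, and the function names `itemset_utility` and `high_utility_patterns` are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations

# Hypothetical per-unit profit of each item (assumed values, not from the paper).
PROFIT = {"a": 2, "b": 6, "c": 3}


def itemset_utility(itemset, transaction):
    """Utility of an itemset in one transaction: sum of quantity * profit
    over its items, or 0 if the transaction lacks any of the items."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * PROFIT[item] for item in itemset)


def high_utility_patterns(window, min_util):
    """Brute-force enumeration of every itemset whose total utility over
    all transactions in the window is at least min_util."""
    transactions = [tx for batch in window for tx in batch]
    items = sorted({item for tx in transactions for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            total = sum(itemset_utility(itemset, tx) for tx in transactions)
            if total >= min_util:
                result[itemset] = total
    return result


# The window keeps only the 2 most recent batches; each transaction maps
# an item to its purchased quantity.
window = deque(maxlen=2)
window.append([{"a": 2, "b": 1}, {"b": 2, "c": 1}])
window.append([{"a": 1, "c": 3}])
first = high_utility_patterns(window, min_util=15)

# A new batch arrives and the oldest batch slides out of the window.
window.append([{"b": 3, "c": 2}])
second = high_utility_patterns(window, min_util=15)

print(first)
print(second)
```

In the first window of this toy data, the itemset (b, c) meets the threshold while (c) alone does not, which illustrates why utility, unlike support, is not downward closed. The sketch also recomputes everything from scratch each time the window slides and enumerates every itemset, which is exponential in the number of items; avoiding that kind of repeated work is the motivation the abstract gives for an incremental tree structure with pattern growth.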
Pages: 11979-11991
Page count: 13