Mining maximal frequent patterns by considering weight conditions over data streams

Cited by: 67
Authors
Yun, Unil [1 ]
Lee, Gangin [1 ]
Ryu, Keun Ho [2 ]
Affiliations
[1] Sejong Univ, Dept Comp Engn, Seoul, South Korea
[2] Chungbuk Natl Univ, Dept Comp Sci, Cheongju, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Data stream; Data mining; Maximal frequent pattern mining; Weight condition; Knowledge discovery; SEQUENTIAL PATTERNS; EFFICIENT ALGORITHMS; ITEMSETS;
DOI
10.1016/j.knosys.2013.10.011
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Frequent pattern mining over data streams is currently one of the most interesting fields in data mining. Modern databases require more immediate processing because enormous amounts of data are accumulated and updated in real time. However, traditional approaches are not entirely suitable for a data stream environment because they operate with more than two database scans. Moreover, frequent pattern mining over data streams usually generates an enormous number of frequent patterns, causing significant overhead. In addition, because weight conditions are useful for reflecting the importance of each object in the real world, they need to be applied to the mining process in order to obtain more practical and meaningful patterns. To address these problems, we propose a novel method for mining Weighted Maximal Frequent Patterns (WMFPs) over data streams, called MWS (Maximal frequent pattern mining with Weight conditions over data Streams). MWS guarantees efficient mining performance in the data stream environment by scanning stream databases only once, and it reduces the overhead of pattern extraction by using a compact representation: the maximal frequent pattern form instead of the general one. Furthermore, MWS enhances the reliability of the mining results by applying weight conditions to each element of the data streams. Extensive experiments show that MWS performs better than previous algorithms. (C) 2013 Elsevier B.V. All rights reserved.
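To make the notion of a weighted maximal frequent pattern concrete, the following is a minimal brute-force sketch in Python. It is not the MWS algorithm: it does not process a stream in a single pass, and the abstract does not give the weight condition, so the definition used here (support count multiplied by the average item weight, compared against a threshold) is an assumption. The names `weighted_maximal_patterns`, `weights`, and `min_wsup` are hypothetical.

```python
from itertools import combinations


def weighted_maximal_patterns(transactions, weights, min_wsup):
    """Brute-force illustration only; NOT the MWS algorithm.

    Assumed definition (not taken from the abstract):
    weighted support = support count * average weight of the pattern's items.
    A pattern is maximal if no proper superset is also weighted-frequent.
    """
    items = sorted({item for t in transactions for item in t})
    txn_sets = [set(t) for t in transactions]

    # Enumerate every non-empty itemset and keep the weighted-frequent ones.
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            support = sum(1 for t in txn_sets if cand <= t)
            avg_weight = sum(weights[i] for i in cand) / len(cand)
            if support * avg_weight >= min_wsup:
                frequent.append(cand)

    # Keep only patterns that have no weighted-frequent proper superset.
    return [p for p in frequent if not any(p < q for q in frequent)]


# Toy data, invented for illustration.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
weights = {"a": 1.0, "b": 0.5, "c": 0.8}
maximal = weighted_maximal_patterns(transactions, weights, min_wsup=1.4)
```

On this toy input, five itemsets pass the assumed weight condition, but only two of them are maximal, which illustrates why reporting maximal patterns shrinks the output. The exhaustive enumeration is exponential in the number of items and is meant only to show the definitions.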
Pages: 49-65
Number of pages: 17
Related papers
47 in total
[1]  
Agrawal R., P 20 INT C VERY LARG
[2]   Single-pass incremental and interactive mining for weighted frequent patterns [J].
Ahmed, Chowdhury Farhan ;
Tanbeer, Syed Khairuzzaman ;
Jeong, Byeong-Soo ;
Lee, Young-Koo ;
Choi, Ho-Jin .
EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (09) :7976-7994
[3]   Efficient Mining of Weighted Frequent Patterns Over Data Streams [J].
Ahmed, Chowdhury Farhan ;
Tanbeer, Syed Khairuzzaman ;
Jeong, Byeong-Soo .
HPCC: 2009 11TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2009, :400-406
[4]  
[Anonymous], J COMPUT INF SYST
[5]  
[Anonymous], 2011, P 17 ACM SIGKDD INT
[6]  
García-Hernández RA, 2010, INFORM-J COMPUT INFO, V34, P93
[7]   A new method for mining Frequent Weighted Itemsets based on WIT-trees [J].
Bay Vo ;
Coenen, Frans ;
Bac Le .
EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (04) :1256-1264
[8]  
Bifet Albert, 2011, P 17 ACM SIGKDD INT, P591, DOI 10.1145/2020408.2020501
[9]  
Bogorny V, 2006, IEEE DATA MINING, P813
[10]   MAFIA: A maximal frequent itemset algorithm [J].
Burdick, D ;
Calimlim, M ;
Flannick, J ;
Gehrke, J ;
Yiu, TM .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (11) :1490-1504