RWFIM: Recent weighted-frequent itemsets mining

Cited by: 33
Authors
Lin, Jerry Chun-Wei [1, 2]
Gan, Wensheng [1]
Fournier-Viger, Philippe [3]
Hong, Tzung-Pei [4, 5]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen Grad Sch, Shenzhen 518055, Peoples R China
[2] Harbin Inst Technol, Shenzhen Key Lab Internet Informat Collaborat, Shenzhen Grad Sch, Shenzhen 518055, Peoples R China
[3] Univ Moncton, Dept Comp Sci, Moncton, NB E1A 3E9, Canada
[4] Natl Univ Kaohsiung, Dept Comp Sci & Informat Engn, Kaohsiung, Taiwan
[5] Natl Sun Yat Sen Univ, Dept Comp Sci & Engn, Kaohsiung 80424, Taiwan
Keywords
Weighted frequent itemsets mining; Time-sensitive constraint; Recent pattern; Projected-based; EW2P strategy; ASSOCIATION RULES; PATTERNS
DOI
10.1016/j.engappai.2015.06.009
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In recent years, weighted frequent itemsets mining (WFIM) has become an important topic in data mining because, unlike traditional frequent itemsets mining, it can discover more useful and interesting patterns in real-world applications. Many algorithms have been developed to find weighted frequent itemsets (WFIs), but they do not take time sensitivity into account, so the out-of-date information they discover may be meaningless and useless for decision making. This paper proposes a novel framework, recent weighted-frequent itemsets mining (RWFIM), that considers both weight and time-sensitive constraints. A projection-based algorithm, RWFIM-P, is first proposed for mining recent weighted-frequent itemsets (RWFIs). It uses a projection-and-test mechanism to discover RWFIs recursively: the entire database is projected and divided into several sub-databases according to the currently processed itemset, which reduces computational cost and memory requirements. A second algorithm, RWFIM-PE, improves on RWFIM-P by using the Estimated Weight of 2-itemset Pruning (EW2P) strategy, which mines RWFIs without generating unpromising candidates and thereby avoids projection computations that RWFIM-P performs. Experiments on several real-world and synthetic datasets compare the two proposed algorithms with traditional WFIM in terms of execution time, number of generated RWFIs, and scalability under varied values of the two minimum thresholds. (C) 2015 Elsevier Ltd. All rights reserved.
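The projection-and-test idea summarized in the abstract can be illustrated with a minimal sketch. This record does not give the paper's formal definitions, so everything below is an assumption made for illustration: the weight of an itemset is taken as the mean of its item weights, weighted support is that weight times the fraction of transactions containing the itemset, the recency of a transaction decays as `(1 - decay) ** (now - timestamp)`, and the recency of an itemset is the sum over the transactions that contain it. The function name `mine_recent_weighted` and all parameter names are hypothetical. The pruning bound used here (global maximum item weight times support) is a simple safe upper bound chosen for the sketch; it is not the EW2P strategy.

```python
from typing import Dict, FrozenSet, List, Tuple

# A transaction is (timestamp, items). This layout is an assumption for the sketch.
Transaction = Tuple[int, FrozenSet[str]]


def mine_recent_weighted(
    db: List[Transaction],
    weights: Dict[str, float],
    min_wsup: float,
    min_rec: float,
    decay: float,
    now: int,
) -> Dict[Tuple[str, ...], Tuple[float, float]]:
    """Return {itemset: (weighted_support, recency)} under the assumed definitions."""
    n = len(db)
    w_max = max(weights.values())  # used for a safe upper bound on weighted support
    results: Dict[Tuple[str, ...], Tuple[float, float]] = {}

    def recency(timestamp: int) -> float:
        # Assumed exponential decay: older transactions count less.
        return (1.0 - decay) ** (now - timestamp)

    def recurse(prefix: Tuple[str, ...], projected: List[Transaction],
                candidates: List[str]) -> None:
        for i, item in enumerate(candidates):
            # Projection: keep only transactions that contain prefix + item.
            sub = [(t, items) for t, items in projected if item in items]
            support = len(sub) / n
            rec = sum(recency(t) for t, _ in sub)
            # Test: both bounds can only shrink for supersets, so pruning is safe.
            if w_max * support < min_wsup or rec < min_rec:
                continue
            itemset = prefix + (item,)
            avg_weight = sum(weights[x] for x in itemset) / len(itemset)
            wsup = avg_weight * support
            if wsup >= min_wsup:
                results[itemset] = (wsup, rec)
            # Recurse on the smaller projected database only.
            recurse(itemset, sub, candidates[i + 1:])

    recurse((), db, sorted(weights))
    return results


example_db = [
    (1, frozenset({"a", "b"})),
    (2, frozenset({"a", "c"})),
    (3, frozenset({"a", "b", "c"})),
    (4, frozenset({"b", "c"})),
]
example_weights = {"a": 0.9, "b": 0.5, "c": 0.2}
found = mine_recent_weighted(example_db, example_weights,
                             min_wsup=0.3, min_rec=1.5, decay=0.1, now=4)
print(sorted(found))
```

Each recursive call scans only the transactions that already contain the current prefix, which is the sense in which projection reduces work. An itemset that fails the upper-bound or recency test is dropped together with all of its extensions.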
Pages: 18-32
Page count: 15