FPS-Tree Algorithm to Find Top-k Closed Itemsets in Data Streams

被引:5
作者
Rehman, Zahoor Ur [1 ]
Shahbaz, Muhammad [2 ]
Shaheen, Muhammad [3 ]
Guergachi, Aziz [4 ]
机构
[1] CIIT Ctr Hlth Res, Attock, Pakistan
[2] Univ Engn & Technol Lahore, Lahore, Pakistan
[3] FUIEMS, Rawalpindi, Pakistan
[4] Ryerson Univ, ITM, Toronto, ON, Canada
关键词
Data stream mining; Closed frequent itemsets; Approximate mining; Sliding window; FP-tree; MINING FREQUENT ITEMSETS; SLIDING WINDOW; EFFICIENT ALGORITHM; PATTERNS;
D O I
10.1007/s13369-015-1811-x
中图分类号
O [数理科学和化学]; P [天文学、地球科学]; Q [生物科学]; N [自然科学总论];
学科分类号
07 ; 0710 ; 09 ;
摘要
Frequent itemset mining has become a popular research area in the data mining community and has been applied in various areas over the last few years. There are two main technical hitches when searching for frequent itemsets. The first is to provide an appropriate minimum support value, and the second is the generation of a large number of association rules. In many cases, users are only interested in finding only top-k frequent itemsets with some defined threshold value. In this paper, we present an algorithm to mine top-k frequent closed itemsets from streaming data using a sliding window approach. A fast algorithm is proposed to find frequent closed itemsets with user-defined minimum and maximum lengths to reduce the number of frequent itemsets. Moreover, we have also incorporated a bitmap-based data structure to improve the performance by eliminating multiple scans. Different datasets have been used for experimentation and to benchmark the proposed technique and algorithm.
引用
收藏
页码:3507 / 3521
页数:15
相关论文
共 44 条
  • [1] Ada Wai-Chee Fu, 2000, Foundations of Intelligent Systems. 12th International Symposium, ISMIS 2000. Proceedings (Lecture Notes in Artificial Intelligence Vol.1932), P59
  • [2] Aggarwal C.C., 2006, DATA STREAMS MODELS
  • [3] Agrawal Rakesh., 1994, P 20 INT C VER LARG, P487
  • [4] An Efficient Algorithm for Mining Closed Frequent Itemsets in Data Streams
    Ao, Fujiang
    Du, Jing
    Yan, Yuejin
    Liu, Baohong
    Huang, Kedi
    [J]. 8TH IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY WORKSHOPS: CIT WORKSHOPS 2008, PROCEEDINGS, 2008, : 37 - +
  • [5] Chang Joong Hyuk, 2003, P 9 ACM SIGKDD INT C, P487, DOI DOI 10.1145/956750.956807
  • [6] Maintaining frequent closed itemsets over a sliding window
    Cheng, James
    Ke, Yiping
    Ng, Wilfred
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2008, 31 (03) : 191 - 215
  • [7] Mining frequent itemsets without support threshold: With and without item constraints
    Cheung, YL
    Fu, AWC
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2004, 16 (09) : 1052 - 1069
  • [8] Moment: Maintaining closed frequent itemsets over a stream sliding window
    Chi, Y
    Wang, HX
    Yu, PS
    Muntz, RR
    [J]. FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2004, : 59 - 66
  • [9] Catch the moment: maintaining closed frequent itemsets over a data stream sliding window
    Chi, Yun
    Wang, Haixun
    Yu, Philip S.
    Muntz, Richard R.
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2006, 10 (03) : 265 - 294
  • [10] A dynamic layout of sliding window for frequent itemset mining over data streams
    Deypir, Mahmood
    Sadreddini, Mohammad Hadi
    [J]. JOURNAL OF SYSTEMS AND SOFTWARE, 2012, 85 (03) : 746 - 759