Mining Discriminative Itemsets Over Data Streams Using Efficient Sliding Window

Authors
Seyfi M. [1]
Nayak R. [1]
Xu Y. [1]
Affiliations
[1] Data Science Discipline, Science and Engineering Faculty, Queensland University of Technology, Brisbane, QLD
Keywords
Data stream mining; Discriminative itemsets; Prefix tree; Sliding window model
DOI
10.1007/s42979-023-01887-x
Abstract
This paper presents a novel, efficient method for mining discriminative itemsets over data streams using the sliding window model. Discriminative itemsets are itemsets that are frequent in the target data stream and whose frequency there is much higher than their frequency in the remaining streams. Mining discriminative itemsets is more challenging than mining frequent itemsets, especially in the sliding window model: as the window slides, the algorithm must cope with the combinatorial explosion of itemsets in more than one data stream, for transactions both entering and leaving the window. We propose a single-scan algorithm that uses two novel in-memory data structures to mine discriminative itemsets with a combination of offline and online sliding windows. Offline processing limits the generation of the many unpromising itemsets. Online processing provides more up-to-date and accurate answers between two offline slides. The discovered discriminative itemsets are updated exactly in the offline sliding window at periodic intervals, and mining continues in the online sliding window between two consecutive offline slides. Extensive empirical analysis shows that the proposed algorithm is efficient in both time and space while maintaining full accuracy. The algorithm can handle large, high-speed, and complex data streams. © 2023, The Author(s).
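The definition in the abstract can be illustrated with a minimal brute-force sketch. This is not the paper's algorithm: the single-scan method and its two in-memory data structures are not described here, so the sketch simply recounts every itemset in the current window on each call. All names (`itemset_counts`, `discriminative_itemsets`, `SlidingWindowMiner`) and the parameters `min_support`, `min_ratio`, and `max_len` are hypothetical. Expressing "much higher" as a minimum ratio of relative frequencies is also an assumption, as is using a single general stream and one fixed-size, count-based window per stream.

```python
from collections import Counter, deque
from itertools import combinations


def itemset_counts(transactions, max_len):
    """Count every itemset of up to max_len items (brute force)."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # canonical order, no repeated items
        for k in range(1, max_len + 1):
            counts.update(combinations(unique, k))
    return counts


def discriminative_itemsets(target, general, min_support, min_ratio, max_len=3):
    """Return {itemset: ratio} for itemsets that are frequent in `target`
    and at least `min_ratio` times more frequent there than in `general`.

    Assumption: both thresholds are applied to relative frequencies.
    """
    t_counts = itemset_counts(target, max_len)
    g_counts = itemset_counts(general, max_len)
    n_t, n_g = len(target), len(general)
    result = {}
    for itemset, count in t_counts.items():
        t_freq = count / n_t
        if t_freq < min_support:
            continue  # not frequent in the target stream
        g_freq = g_counts[itemset] / n_g if n_g else 0.0
        # An itemset absent from the general stream is treated as
        # infinitely discriminative (an assumption of this sketch).
        ratio = float("inf") if g_freq == 0 else t_freq / g_freq
        if ratio >= min_ratio:
            result[itemset] = ratio
    return result


class SlidingWindowMiner:
    """Keeps the last `window_size` transactions of each stream."""

    def __init__(self, window_size, min_support, min_ratio, max_len=3):
        # deque(maxlen=...) evicts the oldest transaction automatically
        # when a new one arrives, which models the window sliding.
        self.windows = {
            "target": deque(maxlen=window_size),
            "general": deque(maxlen=window_size),
        }
        self.min_support = min_support
        self.min_ratio = min_ratio
        self.max_len = max_len

    def add(self, stream, transaction):
        """Append a transaction to the 'target' or 'general' window."""
        if stream not in self.windows:
            raise ValueError(f"unknown stream: {stream!r}")
        self.windows[stream].append(list(transaction))

    def mine(self):
        """Recompute the discriminative itemsets for the current windows."""
        return discriminative_itemsets(
            self.windows["target"],
            self.windows["general"],
            self.min_support,
            self.min_ratio,
            self.max_len,
        )
```

Recounting the whole window on every call is exactly the cost an incremental method is meant to avoid, so this sketch is useful only for checking the definition on small inputs, for example by calling `add("target", ...)` and `add("general", ...)` a few times and inspecting `mine()` after each slide.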
Related papers
  • [1] Mining frequent itemsets over data streams using efficient window sliding techniques
    Li, Hua-Fu
    Lee, Suh-Yin
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (02) : 1466 - 1477
  • [2] Mining weighted frequent itemsets using window sliding over data streams
    Kim, Younghee
    Kim, Wonyoung
    Ryu, Joonsuk
    Kim, Ungmo
    ICCIT: 2009 FOURTH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCES AND CONVERGENCE INFORMATION TECHNOLOGY, VOLS 1 AND 2, 2009, : 708 - 713
  • [3] Efficient maintenance and mining of frequent itemsets over Online data streams with a sliding window
    Li, Hua-Fu
    Ho, Chin-Chuan
    Shan, Man-Kwan
    Lee, Suh-Yin
    2006 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-6, PROCEEDINGS, 2006, : 2672 - +
  • [4] Mining maximal frequent itemsets in a sliding window over data streams
    Mao Y.
    Li H.
    Yang L.
    Liu L.
    Gaojishu Tongxin/Chinese High Technology Letters, 2010, 20 (11) : 1142 - 1148
  • [5] Mining Approximate Frequent Itemsets over Data Streams Using Window Sliding Techniques
    Kim, Younghee
    Park, Eunkyoung
    Kim, Ungmo
    DATABASE THEORY AND APPLICATION, 2009, 64 : 49 - 56
  • [6] Mining Recent Maximal Frequent Itemsets Over Data Streams with Sliding Window
    Cai, Saihua
    Hao, Shangbo
    Sun, Ruizhi
    Wu, Gang
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2019, 16 (06) : 961 - 969
  • [7] Mining frequent itemsets in data streams using the weighted sliding window model
    Tsai, Pauray S. M.
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (09) : 11617 - 11625
  • [8] A frequent itemsets mining algorithm based on matrix in sliding window over data streams
    Fan Guidan
    Yin Shaohong
    2013 THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM DESIGN AND ENGINEERING APPLICATIONS (ISDEA), 2013, : 66 - 69
  • [9] Mining discriminative itemsets in data streams using the tilted-time window model
    Seyfi, Majid
    Nayak, Richi
    Xu, Yue
    Geva, Shlomo
    KNOWLEDGE AND INFORMATION SYSTEMS, 2021, 63 (05) : 1241 - 1270