On the design of hardware-software architectures for frequent itemsets mining on data streams

Cited by: 0
Authors
Lázaro Bustio-Martínez
René Cumplido
Raudel Hernández-León
José M. Bande-Serrano
Claudia Feregrino-Uribe
Affiliations
[1] National Institute for Astrophysics, Optics and Electronics, Computer Sciences Department
[2] Advanced Technologies Application Center
Source
Journal of Intelligent Information Systems | 2018 / Vol. 50
Keywords
Data Mining; Frequent Itemsets Mining; Data streams; Reconfigurable Hardware; Parallel algorithms
DOI
Not available
Abstract
Frequent Itemsets Mining has been applied in many data processing applications with remarkable results. Recently, data stream processing has gained a lot of attention due to its practical applications. Data in data streams are transmitted at high rates and cannot be stored for offline processing, which makes it impractical to apply traditional data mining approaches (such as Frequent Itemsets Mining) directly to data streams. In this paper, two single-pass parallel algorithms based on a tree data structure are proposed for Frequent Itemsets Mining on data streams. The presented algorithms employ the Landmark and Sliding Window models for window handling. As in other reviewed papers, the proposed algorithms perform an exact mining process if the number of frequent items in the data stream is low. If, on the contrary, the number of frequent patterns is large, the mining process is approximate but produces no false positives. The experiments conducted show that the presented algorithms achieve shorter processing times than the hardware architectures reported in the state of the art.
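The difference between the Landmark and Sliding Window models mentioned in the abstract can be illustrated with a minimal software sketch. This is not the authors' tree-based parallel hardware design: it is a plain exact counter, and the names `WindowMiner`, `itemsets`, `window`, and `max_size` are hypothetical, introduced only for illustration. It also assumes itemsets are capped at a small size so the counts stay manageable.

```python
from collections import Counter, deque
from itertools import combinations


def itemsets(transaction, max_size):
    """Yield every non-empty itemset of up to max_size items, as sorted tuples."""
    items = sorted(set(transaction))
    for k in range(1, max_size + 1):
        yield from combinations(items, k)


class WindowMiner:
    """Single-pass support counting (illustrative only, not the paper's design).

    window=None counts from the start of the stream (landmark model);
    window=n keeps only the n most recent transactions (sliding window model).
    """

    def __init__(self, window=None, max_size=2):
        self.window = window
        self.max_size = max_size
        self.counts = Counter()
        self.buffer = deque()  # itemsets of the transactions still in the window
        self.seen = 0          # number of transactions currently counted

    def add(self, transaction):
        subsets = list(itemsets(transaction, self.max_size))
        self.counts.update(subsets)
        self.seen += 1
        if self.window is not None:
            self.buffer.append(subsets)
            if len(self.buffer) > self.window:
                # Sliding window: forget the oldest transaction.
                self.counts.subtract(self.buffer.popleft())
                self.seen -= 1

    def frequent(self, min_support):
        """Return itemsets whose relative support is at least min_support."""
        threshold = min_support * self.seen
        return {s: c for s, c in self.counts.items() if c > 0 and c >= threshold}
```

With `window=None` every transaction since the start of the stream contributes to the counts, whereas a finite `window` lets old transactions expire, so an itemset that was frequent early in the stream can drop out of the result.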
Pages: 415-440
Page count: 25