HashEclat: an efficient frequent itemset algorithm

被引:27
|
作者
Zhang, Chunkai [1 ]
Tian, Panbo [1 ]
Zhang, Xudong [1 ]
Liao, Qing [1 ]
Jiang, Zoe L. [1 ]
Wang, Xuan [1 ]
机构
[1] Harbin Inst Technol Shenzhen, Dept Comp Sci & Technol, Shenzhen, Peoples R China
关键词
Frequent itemset; MinHash; Approximate algorithm; Eclat;
D O I
10.1007/s13042-018-00918-x
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The Eclat algorithm is one of the most widely used frequent itemset mining methods. However, the inefficiency for calculating the intersection of itemsets makes it a time-consuming method, especially when the dataset has a large number of transactions. In this work, for the purpose of efficiency improvement, we proposed an approximate Eclat algorithm named HashEclat based on MinHash, which could quickly estimate the size of the intersection set, and adjust the parameters k, E and minSup to consider the tradeoff between accuracy of the mining results and execution time. The parameter k is the top-k parameter of one-permutation MinHash algorithm; the parameter E is the estimate error of one intersection size; the parameter minSup is the minimum support threshold. In many real situations, an approximate result with faster speed maybe more useful than 'exact' result. The theoretical analysis and experiment results that we present demonstrate that the proposed algorithm can output almost all of the frequent itemset with faster speed and less memory space.
引用
收藏
页码:3003 / 3016
页数:14
相关论文
共 50 条
  • [31] Discovering frequent itemsets by support approximation and itemset clustering
    Jea, Kuen-Fang
    Chang, Ming-Yuan
    DATA & KNOWLEDGE ENGINEERING, 2008, 65 (01) : 90 - 107
  • [32] An FPGA-Based Accelerator for Frequent Itemset Mining
    Zhang, Yan
    Zhang, Fan
    Jin, Zheming
    Bakos, Jason D.
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2013, 6 (01)
  • [33] A new approach for mining frequent K-itemset
    Sankar, H. Ravi
    Naidu, M. M.
    WCECS 2007: WORLD CONGRESS ON ENGINEERING AND COMPUTER SCIENCE, 2007, : 718 - +
  • [34] Recommendation using Frequent Itemset Mining in Big Data
    Kunjachan, Honeytta
    Hareesh, M. J.
    Sreedevi, K. M.
    PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2018, : 561 - 566
  • [35] Model-based probabilistic frequent itemset mining
    Thomas Bernecker
    Reynold Cheng
    David W. Cheung
    Hans-Peter Kriegel
    Sau Dan Lee
    Matthias Renz
    Florian Verhein
    Liang Wang
    Andreas Zuefle
    Knowledge and Information Systems, 2013, 37 : 181 - 217
  • [36] An Efficient Algorithm for Mining Frequent Itemsets with Single Constraint
    Hai Duong
    Tin Truong
    Bac Le
    ADVANCED COMPUTATIONAL METHODS FOR KNOWLEDGE ENGINEERING, 2013, 479 : 367 - 378
  • [37] A Frequent Itemset Mining Method Based on Local Differential Privacy
    Wu, Ning
    Zou, Yunfeng
    Shan, Chao
    WEB INFORMATION SYSTEMS AND APPLICATIONS (WISA 2021), 2021, 12999 : 225 - 236
  • [38] A Performance based Empirical Study of the Frequent Itemset Mining Algorithms
    Sivakumar, Ramah
    Sathiaseelan, J. G. R.
    2017 IEEE INTERNATIONAL CONFERENCE ON POWER, CONTROL, SIGNALS AND INSTRUMENTATION ENGINEERING (ICPCSI), 2017, : 1627 - 1631
  • [39] Frequent Itemset Mining on Large-Scale Shared Memory Machines
    Zhang, Yan
    Zhang, Fan
    Bakos, Jason
    2011 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2011, : 585 - 589
  • [40] MapReduce-based Frequent Itemset Mining for Analysis of Electronic Evidence
    Jiang, Xueqing
    Sun, Guozi
    2013 EIGHTH INTERNATIONAL WORKSHOP ON SYSTEMATIC APPROACHES TO DIGITAL FORENSIC ENGINEERING (SADFE), 2013,