A New Parallel Algorithm for the Frequent Itemset Mining Problem

被引:3
|
作者
Craus, Mitica [1 ]
机构
[1] Gh Asachi Tech Univ, Dept Comp Sci & Engn, Iasi 700050, Romania
来源
PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING | 2008年
关键词
D O I
10.1109/ISPDC.2008.45
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
A new parallel algorithm for finding the frequent itemsets in databases is presented. It differs fundamentally of well known Apriori algorithm, where at the beginning of every step, the dimension of the new frequent itemsets increases by I. In our algorithm the frequent itemsets are determined by progressively enlarging the interval which the individual items appertain, i.e. if at the k-th step the new candidates are from [i, i + k] intervals, i = 1, 2...., n - k, at the next step, k + 1 1, the new candidates will belong to [i, i + k + 1] intervals, i = 1, 2,..., n - k - 1. The frequent individual items are identified by their index. The basic idea is that the new frequent itemsets with individual items from the interval [i, j], simultaneously contain the items i and j. The frequent itemsets are built by sharing the work between n processors. Hereby, the processor P(i) computes, step by step, the sets F(i,j) of the frequent itemsets with individual items from the intervals [i, j], j i,..., n. In order to compute the set F(i,j), the processing unit Pi uses F(i,j-1) obtained in the previous step and F(i+1,j) received from the processor P(i+1). The main advantage of our parallel algorithm is that it uses a communication pattern known before algorithm start, which allows mapping communication to hardware. Another major advantage is that the set of the transactions can be distributed to processors prior to beginning. This is possible because a processor Pi has to compute F(i,j), j = i,..., n and therefore only the transactions containing the frequent item i are needed.
引用
收藏
页码:165 / 170
页数:6
相关论文
共 50 条
  • [1] A parallel algorithm for frequent itemset mining
    Li, L
    Zhai, DH
    Fan, J
    PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PDCAT'2003, PROCEEDINGS, 2003, : 868 - 871
  • [2] A Generalized Parallel Algorithm for Frequent Itemset Mining
    Craus, Mitica
    Archip, Alexandru
    PROCEEDINGS OF THE 12TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTERS , PTS 1-3: NEW ASPECTS OF COMPUTERS, 2008, : 520 - +
  • [3] A Highly Parallel Algorithm for Frequent Itemset Mining
    Mesa, Alejandro
    Feregrino-Uribe, Claudia
    Cumplido, Rene
    Hernandez-Palancar, Jose
    ADVANCES IN PATTERN RECOGNITION, 2010, 6256 : 291 - +
  • [4] YAFIM: A Parallel Frequent Itemset Mining Algorithm with Spark
    Qiu, Hongjian
    Gu, Rong
    Yuan, Chunfeng
    Huang, Yihua
    PROCEEDINGS OF 2014 IEEE INTERNATIONAL PARALLEL & DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2014, : 1664 - 1671
  • [5] A novel parallel frequent itemset mining algorithm for automatic enterprise
    Mao, Yimin
    Wu, Bin
    Deng, Qianhu
    Mahmoodi, Soroosh
    Chen, Zhigang
    Chen, Yeh-Cheng
    ENTERPRISE INFORMATION SYSTEMS, 2023, 17 (10)
  • [6] A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce
    Fumarola, Fabio
    Malerba, Donato
    2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2014, : 335 - 342
  • [7] A Novel Parallel Algorithm for Frequent Itemset Mining of Incremental Dataset
    Xu, Lijun
    Zhang, Yun
    2015 2ND INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING ICISCE 2015, 2015, : 41 - 44
  • [8] Frequent itemset mining with parallel RDBMS
    Shang, XQ
    Sattler, KU
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2005, 3518 : 539 - 544
  • [9] PFIMD: a parallel MapReduce-based algorithm for frequent itemset mining
    Mao, Yimin
    Geng, Junhao
    Mwakapesa, Deborah Simon
    Nanehkaran, Yaser Ahangari
    Chi, Zhang
    Deng, Xiaoheng
    Chen, Zhigang
    MULTIMEDIA SYSTEMS, 2021, 27 (04) : 709 - 722
  • [10] PFIMD: a parallel MapReduce-based algorithm for frequent itemset mining
    Mao Yimin
    Geng Junhao
    Deborah Simon Mwakapesa
    Yaser Ahangari Nanehkaran
    Zhang Chi
    Deng Xiaoheng
    Chen Zhigang
    Multimedia Systems, 2021, 27 : 709 - 722