Frequent pattern mining on message passing multiprocessor systems

被引:61
|
作者
Javed, A [1 ]
Khokhar, A [1 ]
机构
[1] Univ Illinois, Chicago, IL 60607 USA
基金
美国国家科学基金会;
关键词
frequent pattern mining; parallel processing; association rule; data mining;
D O I
10.1023/B:DAPD.0000031634.19130.bd
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Extraction of frequent patterns in transaction-oriented database is crucial to several data mining tasks such as association rule generation, time series analysis, classification, etc. Most of these mining tasks require multiple passes over the database and if the database size is large, which is usually the case, scalable high performance solutions involving multiple processors are required. This paper presents an efficient scalable parallel algorithm for mining frequent patterns on parallel shared nothing platforms. The proposed algorithm is based on one of the best known sequential techniques referred to as Frequent Pattern (FP) Growth algorithm. Unlike most of the earlier parallel approaches based on different variants of the Apriori Algorithm, the algorithm presented in this paper does not explicitly result in having entire counting data structure duplicated on each processor. Furthermore, the proposed algorithm introduces minimum communication (and hence synchronization) overheads by efficiently partitioning the list of frequent elements list over processors. The experimental results show scalable performance over different machine and problem sizes. The comparison of implementation results with existing parallel approaches show significant gains in the speedup. On an 8-processor machine, we report an average speedup of 6 for different problem sizes.
引用
收藏
页码:321 / 334
页数:14
相关论文
共 50 条
  • [21] The Studies of Mining Frequent Patterns Based on Frequent Pattern Tree
    Yen, Show-Jane
    Lee, Yue-Shi
    Wang, Chiu-Kuang
    Wu, Jung-Wei
    Ouyang, Liang-Yu
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2009, 5476 : 232 - +
  • [22] Mining frequent patterns with incremental updating frequent pattern tree
    Zhu, Qunxiong
    Lin, Xiaoyong
    WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 5923 - +
  • [23] A New Data Structure to Enhance the Speed of Frequent Pattern Mining
    Derakhshan, Reza
    Ahmadi, Ali
    2017 25TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2017, : 2128 - 2133
  • [24] Clustering of Windows Security Events by Means of Frequent Pattern Mining
    Basagoiti, Rosa
    Zurutuza, Urko
    Aztiria, Asier
    Santafe, Guzman
    Reyes, Mario
    COMPUTATIONAL INTELLIGENCE IN SECURITY FOR INFORMATION SYSTEMS, 2009, 63 : 19 - +
  • [25] Novel Frequent Pattern Mining Algorithm based on Parallelization scheme
    Gatuha, George
    Jiang, Tao
    INTERNATIONAL JOURNAL OF ENGINEERING RESEARCH IN AFRICA, 2016, 23 : 131 - 140
  • [26] Efficient frequent pattern mining based on Linear Prefix tree
    Pyun, Gwangbum
    Yun, Unil
    Ryu, Keun Ho
    KNOWLEDGE-BASED SYSTEMS, 2014, 55 : 125 - 139
  • [27] Improved pattern tree for incremental frequent-pattern mining
    Zhou M.
    Wang T.
    Transactions of Tianjin University, 2010, 16 (2) : 129 - 134
  • [28] Investigating TYPE constraint for frequent pattern mining
    Ahmad, Munir
    Farooq, Umar
    Atta-Ur-Rahman
    Alqatari, Abdulrahman
    Dash, Sujata
    Luhach, Ashish Kr
    JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2019, 22 (04) : 605 - 626
  • [29] Frequent Sequence Pattern Mining with Differential Privacy
    Zhou, Fengli
    Lin, Xiaoli
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, PT I, 2018, 10954 : 454 - 466
  • [30] The Maximal Frequent Pattern Mining of DNA Sequence
    Bai, Shuang
    Bai, Si-Xue
    2009 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING ( GRC 2009), 2009, : 23 - 26