A fast and resource efficient mining algorithm for discovering frequent patterns in distributed computing environments

Cited by: 16
Authors
Lin, Kawuu W. [1 ]
Chung, Sheng-Hao [1 ]
Affiliations
[1] Natl Kaohsiung Univ Appl Sci, Dept Comp Sci & Informat Engn, Kaohsiung 807, Taiwan
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2015 / Vol. 52
Keywords
Data mining; Frequent pattern mining; Distributed mining; Parallel mining; DATABASES;
DOI
10.1016/j.future.2015.05.009
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Advances in electronic technology make it possible to collect logs from a wide variety of devices. Such logs require detailed analysis before they become broadly useful. Data mining is a technique widely used to extract hidden information from such data; its main tasks are association rule mining, sequential pattern mining, classification, and clustering. Association rule mining has attracted significant attention and has been applied successfully in various fields. Although past studies can effectively discover frequent patterns from which association rules are deduced, execution efficiency remains a critical problem. To speed up execution, many methods based on parallel and distributed computing have been proposed in recent years. Most past studies focused on parallelizing the workload on a high-end machine or in distributed computing environments such as grid or cloud computing systems; however, very few discuss how to efficiently determine the appropriate number of computing nodes with execution efficiency and load balancing in mind. An intuitive assumption is that execution speed is proportional to the number of computing nodes, that is, the more computing nodes, the faster the execution. However, this is incorrect for such algorithms because of their inherent algorithmic design. Allocating too many computing nodes can lead to long execution times and, beyond the inefficiency, wastes computing power and network bandwidth. Conversely, the load cannot be distributed effectively if too few nodes are allocated. In this paper, we propose a fast, load-balancing, and resource-efficient algorithm named FLR-Mining for discovering frequent patterns in distributed computing systems. FLR-Mining is capable of automatically determining the appropriate number of computing nodes and achieves better load balancing than existing methods. Through empirical evaluation, FLR-Mining is shown to deliver excellent performance in terms of execution efficiency and load balancing. (C) 2015 Elsevier B.V. All rights reserved.
Pages: 49 - 58
Page count: 10
Related papers
50 in total
  • [31] An Algorithm of Mining Frequent Itemsets in Pervasive Computing
    Teng, Shaohua
    Su, Jiangyu
    Zhang, Wei
    Fu, Xiufen
    Chen, Shuqing
    2008 3RD INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND APPLICATIONS, VOLS 1 AND 2, 2008, : 561 - 565
  • [32] Mining frequent patterns securely in distributed system
    Wang, Jiahong
    Fukasawa, Takuya
    Urabe, Shintaro
    Takata, Toyoo
    Miyazaki, Masatoshi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2006, E89D (11) : 2739 - 2747
  • [33] An efficient and effective algorithm for mining top-rank-k frequent patterns
    Quyen Huynh-Thi-Le
    Tuong Le
    Bay Vo
    Bac Le
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (01) : 156 - 164
  • [34] An Efficient Load Balancing Multi-core Frequent Patterns Mining Algorithm
    Yu, Kun-Ming
    Wu, Shu-Hao
    TRUSTCOM 2011: 2011 INTERNATIONAL JOINT CONFERENCE OF IEEE TRUSTCOM-11/IEEE ICESS-11/FCST-11, 2011, : 1408 - 1412
  • [35] An Efficient Algorithm for Mining Frequent Patterns over High Speed Data Streams
    Meng, Cai-xia
    2009 WRI WORLD CONGRESS ON SOFTWARE ENGINEERING, VOL 1, PROCEEDINGS, 2009, : 319 - 323
  • [36] A fast algorithm for mining frequent ordered subtrees
    Hido, Shohei
    Kawano, Hiroyuki
    Systems and Computers in Japan, 2007, 38 (07) : 34 - 43
  • [37] Computing the minimum-support for mining frequent patterns
    Shichao Zhang
    Xindong Wu
    Chengqi Zhang
    Jingli Lu
    Knowledge and Information Systems, 2008, 15 : 233 - 257
  • [38] Computing the minimum-support for mining frequent patterns
    Zhang, Shichao
    Wu, Xindong
    Zhang, Chengqi
    Lu, Jingli
    KNOWLEDGE AND INFORMATION SYSTEMS, 2008, 15 (02) : 233 - 257
  • [39] An Improved Algorithm for Mining Maximal Frequent Patterns
    Hu, Yan
    Han, Ruixue
    FIRST IITA INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2009, : 746 - 749
  • [40] An incremental algorithm for mining frequent closed patterns
    Shi, Huai-Dong
    Cai, Ming
    Wu, Hong-Sen
    Dong, Jin-Xiang
    Fu, Hao
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2009, 43 (08) : 1389 - 1395