A fast and resource efficient mining algorithm for discovering frequent patterns in distributed computing environments

Cited by: 16

Authors:

Lin, Kawuu W. [1]

Chung, Sheng-Hao [1]

Affiliations:

[1] Natl Kaohsiung Univ Appl Sci, Dept Comp Sci & Informat Engn, Kaohsiung 807, Taiwan

Source:

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2015 / Vol. 52

Keywords:

Data mining; Frequent pattern mining; Distributed mining; Parallel mining; DATABASES;

DOI:

10.1016/j.future.2015.05.009

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

Advances in electronic technology enable us to collect logs from various devices. Such logs require detailed analysis in order to be broadly useful. Data mining is a technique that has been widely used to extract hidden information from such data. It mainly comprises association rule mining, sequential pattern mining, classification, and clustering. Association rule mining has attracted significant attention and has been successfully applied in various fields. Although past studies can effectively discover frequent patterns from which association rules are deduced, execution efficiency remains a critical problem. To speed up execution, many methods using parallel and distributed computing technology have been proposed in recent years. Most past studies focused on parallelizing the workload on a high-end machine or in distributed computing environments such as grid or cloud computing systems; however, very few of them discuss how to efficiently determine the appropriate number of computing nodes, considering execution efficiency and load balancing. An intuitive assumption is that execution speed is proportional to the number of computing nodes; that is, the more computing nodes there are, the faster the execution. However, this is incorrect for such algorithms because of their inherent algorithmic design. Allocating too many computing nodes can lead to high execution time. Besides being inefficient, inappropriate resource allocation wastes computing power and network bandwidth. At the same time, load cannot be effectively distributed if too few nodes are allocated. In this paper, we propose a fast, load-balancing, and resource-efficient algorithm named FLR-Mining for discovering frequent patterns in distributed computing systems. FLR-Mining is capable of determining the appropriate number of computing nodes automatically and achieves better load balancing than existing methods. Through empirical evaluation, FLR-Mining is shown to deliver excellent performance in terms of execution efficiency and load balancing. (C) 2015 Elsevier B.V. All rights reserved.
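The abstract's point that adding nodes does not always speed up execution can be illustrated with a toy cost model. The sketch below is not FLR-Mining, whose method the abstract does not describe; it assumes, purely for illustration, that compute time shrinks as `work / nodes` while coordination cost grows linearly with the node count. The function names and the `work` and `overhead_per_node` parameters are hypothetical.

```python
def estimated_time(nodes: int, work: float, overhead_per_node: float) -> float:
    """Toy model (an assumption, not from the paper): divisible compute
    time plus a coordination cost that grows with the node count."""
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return work / nodes + overhead_per_node * nodes


def best_node_count(work: float, overhead_per_node: float, max_nodes: int) -> int:
    """Return the node count in 1..max_nodes with the lowest estimated time."""
    return min(
        range(1, max_nodes + 1),
        key=lambda n: estimated_time(n, work, overhead_per_node),
    )


if __name__ == "__main__":
    # Compare a few allocations under the toy model.
    for n in (1, 4, 10, 32):
        print(n, estimated_time(n, work=100.0, overhead_per_node=1.0))
    print("best:", best_node_count(100.0, 1.0, 64))
```

Under these assumed parameters the estimated time falls until 10 nodes and rises afterwards, so allocating 32 nodes is slower than allocating 10. A real system would need measured costs rather than this linear overhead term.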


Pages: 49-58

Page count: 10

