A fast and resource efficient mining algorithm for discovering frequent patterns in distributed computing environments

Cited by: 16

Authors:

Lin, Kawuu W. [1]

Chung, Sheng-Hao [1]

Affiliations:

[1] Natl Kaohsiung Univ Appl Sci, Dept Comp Sci & Informat Engn, Kaohsiung 807, Taiwan

Source:

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2015 / Vol. 52

Keywords:

Data mining; Frequent pattern mining; Distributed mining; Parallel mining; Databases

DOI:

10.1016/j.future.2015.05.009

Chinese Library Classification (CLC) number:

TP301 [Theory, methods]

Discipline classification code:

081202

Abstract:

Advances in electronic technology make it possible to collect logs from a wide range of devices. Such logs require detailed analysis before they become broadly useful. Data mining is widely used to extract hidden information from such data and mainly comprises association rule mining, sequential pattern mining, classification, and clustering. Association rule mining has attracted significant attention and has been applied successfully in various fields. Although past studies can effectively discover the frequent patterns from which association rules are deduced, execution efficiency remains a critical problem. To speed up execution, many methods based on parallel and distributed computing have been proposed in recent years. Most past studies focused on parallelizing the workload on a high-end machine or in distributed computing environments such as grid or cloud computing systems; however, very few discuss how to efficiently determine the appropriate number of computing nodes with execution efficiency and load balancing in mind. An intuitive assumption is that execution speed is proportional to the number of computing nodes, that is, the more computing nodes, the faster the execution. However, this does not hold for such algorithms because of their inherent algorithmic design. Allocating too many computing nodes can lead to long execution times. Beyond this inefficiency, inappropriate resource allocation wastes computing power and network bandwidth. At the same time, the load cannot be distributed effectively if too few nodes are allocated. In this paper, we propose a fast, load-balancing, and resource-efficient algorithm named FLR-Mining for discovering frequent patterns in distributed computing systems. FLR-Mining is capable of automatically determining the appropriate number of computing nodes and achieves better load balancing than existing methods. Through empirical evaluation, FLR-Mining is shown to deliver excellent performance in terms of execution efficiency and load balancing. (C) 2015 Elsevier B.V. All rights reserved.
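The abstract's point that adding nodes does not always reduce execution time can be illustrated with a toy cost model. The sketch below is not FLR-Mining and is not taken from the paper; the function names, the cost formula (work divided across nodes plus a per-node coordination overhead), and the default constants are all assumptions chosen only for illustration.

```python
# Toy cost model (an assumption, not the paper's method):
# total time = parallelizable work split across nodes
#            + coordination/communication overhead that grows with node count.

def estimated_time(nodes, work=1000.0, overhead_per_node=2.5):
    """Hypothetical execution time for a given number of computing nodes."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return work / nodes + overhead_per_node * nodes


def best_node_count(max_nodes, work=1000.0, overhead_per_node=2.5):
    """Return the node count in 1..max_nodes with the lowest estimated time."""
    return min(
        range(1, max_nodes + 1),
        key=lambda n: estimated_time(n, work, overhead_per_node),
    )


if __name__ == "__main__":
    for n in (1, 5, 20, 64):
        print(n, round(estimated_time(n), 2))
    print("best:", best_node_count(64))
```

Under these assumed constants the estimated time falls as nodes are added up to a point and then rises again, so both too few and too many nodes are worse than an intermediate allocation.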

Pages: 49-58

Page count: 10

