Efficient frequent itemsets mining through sampling and information granulation

Cited by: 17

Authors:

Zhang, Zhongjie [1]

Pedrycz, Witold [2]

Huang, Jian [1]

Affiliations:

[1] Natl Univ Def Technol, Coll Mech Engn & Automat, Changsha 410073, Hunan, Peoples R China

[2] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6R 2G7, Canada

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2017 / Vol. 65

Keywords:

Frequent itemsets mining; Sampling; Information granulation; PATTERN TREE; BITTABLEFI; ALGORITHM;

DOI:

10.1016/j.engappai.2017.07.016

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

In this study, we propose an algorithm that forms high-quality approximate frequent itemsets from datasets with a large number of transactions. The results produced by the algorithm contain, with high probability, all frequent itemsets; no itemset with support much lower than the minimum support is included; and the supports obtained by the algorithm are close to the real values. To avoid an over-estimated sample size and significant computing overhead, the task of reducing the data is decomposed into three subproblems, which are solved one by one using sampling and information granulation. First, the algorithm obtains a rough support for every item by sampling and removes the infrequent items, which simplifies the data. Then, another sample is taken from the simplified data and clustered into information granules. After this data reduction, the granules are mined by an improved Apriori algorithm. A tight guarantee for the quality of the final results is provided. The performance of the approach is quantified through a series of experiments. (C) 2017 Elsevier Ltd. All rights reserved.
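
The pipeline outlined in the abstract (sample, filter items, sample again, group, mine) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions that are not in the abstract: the function name `approx_frequent_itemsets`, the `slack` factor used to keep borderline items, and a single `sample_size` for both samples are hypothetical; merging identical simplified transactions into weighted groups is only a crude stand-in for the paper's information granulation; and a plain level-wise Apriori replaces the improved Apriori. The sketch carries none of the paper's quality guarantees.

```python
import random
from collections import Counter
from itertools import combinations


def approx_frequent_itemsets(transactions, min_support, sample_size,
                             slack=0.8, seed=0):
    """Return {itemset: estimated support} mined from sampled data.

    Hypothetical helper: `slack` and `sample_size` are illustrative
    parameters, not values taken from the paper.
    """
    rng = random.Random(seed)
    n = min(sample_size, len(transactions))

    # Step 1: rough item supports from a first sample; drop infrequent items.
    first = rng.sample(transactions, n)
    item_counts = Counter(item for t in first for item in set(t))
    keep = {i for i, c in item_counts.items() if c / n >= slack * min_support}

    # Step 2: take a second sample and simplify it to the kept items.
    second = rng.sample(transactions, n)
    simplified = [frozenset(t) & keep for t in second]

    # Step 3 (stand-in for granulation, an assumption): merge identical
    # simplified transactions into weighted groups.
    granules = Counter(t for t in simplified if t)

    def support(itemset):
        return sum(w for g, w in granules.items() if itemset <= g) / n

    # Step 4: plain level-wise Apriori over the weighted groups.
    result = {}
    level = [frozenset([i]) for i in keep]
    k = 1
    while level:
        frequent = {}
        for candidate in level:
            s = support(candidate)
            if s >= min_support:
                frequent[candidate] = s
        result.update(frequent)
        k += 1
        # Join frequent (k-1)-itemsets, then prune candidates that have
        # an infrequent (k-1)-subset.
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
        level = [c for c in joined
                 if all(frozenset(s) in frequent
                        for s in combinations(c, k - 1))]
    return result
```

Calling the function with a `sample_size` at least as large as the dataset makes both samples cover every transaction, which is a convenient way to check the mining step on small inputs before experimenting with smaller samples.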

Pages: 119-136

Number of pages: 18
