Comparative Analysis of Genetic Based Approach and Apriori Algorithm for Mining Maximal Frequent Item Sets

Cited by: 0
Authors
Kabir, Mir Md. Jahangir [1 ,2 ]
Xu, Shuxiang [1 ,2 ]
Kang, Byeong Ho [1 ,2 ]
Zhao, Zongyuan [1 ,2 ]
Affiliations
[1] Univ Tasmania, Sch Engn, Launceston, Tas, Australia
[2] Univ Tasmania, ICT, Launceston, Tas, Australia
Source
2015 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) | 2015
Keywords
association rules; data mining; maximal frequent item sets; genetic algorithm; lexicographic tree
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In data mining, discovering frequent item sets is a key step in mining association rules. For large datasets, a low support value generates a huge number of frequent patterns, which is a major challenge in frequent pattern mining. Maximal frequent pattern mining helps to resolve this problem, since a maximal frequent pattern contains information about a large number of smaller frequent sub-patterns. In this study we developed a genetic-based approach that finds maximal frequent patterns using a user-defined threshold value as a constraint. A genetic algorithm is a good choice for such search problems: it mimics natural selection and uses a global search mechanism, which suits large search spaces. Evolutionary algorithms are also effective for undetermined solutions. Therefore, this approach uses a genetic algorithm to find maximal frequent item sets from different kinds of data sets. A low support value generates some large patterns that contain information about a huge number of small frequent sub-patterns, which could be useful for mining association rules. We applied this genetic-based approach to real data sets as well as synthetic data sets. The experimental results show that the proposed approach evaluates fewer nodes than the number of candidate item sets considered by the Apriori algorithm, especially when the support value is set low.
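To make the terms in the abstract concrete, the following is a minimal sketch of the Apriori baseline only, not the authors' genetic approach. It counts how many candidate item sets a textbook level-wise Apriori evaluates on a toy data set and then extracts the maximal frequent item sets (frequent sets with no frequent proper superset). The function names `apriori` and `maximal`, the toy transactions, and the absolute support threshold of 2 are all illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise Apriori (illustrative sketch).

    Returns (frequent, evaluated): a dict mapping each frequent item set to
    its support count, and the number of candidate item sets that were counted.
    min_support is an absolute transaction count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    evaluated = 0
    k = 1
    while candidates:
        evaluated += len(candidates)
        # Count the support of every candidate of size k.
        level = {}
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count >= min_support:
                level[c] = count
        frequent.update(level)
        # Join step: unions of two frequent k-item sets that have size k + 1.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k - 1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent, evaluated


def maximal(frequent):
    """Keep only frequent item sets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy data: five transactions over items a, b, c, d.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"c", "d"},
    {"d"},
]
frequent, evaluated = apriori(transactions, min_support=2)
maximal_sets = maximal(frequent)

print(evaluated)      # 11 candidates counted (4 + 6 + 1)
print(len(frequent))  # 8 frequent item sets
print(sorted(sorted(s) for s in maximal_sets))  # [['a', 'b', 'c'], ['d']]
```

In this toy run, two maximal item sets summarize all eight frequent item sets, which is the compression the abstract refers to. The candidate count is the quantity the abstract compares against the number of nodes evaluated by the genetic approach.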
Pages: 39-45
Number of pages: 7
Related papers
23 in total
[1]  
Agarwal R. C., 2000, Proceedings. KDD-2000. Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, P108, DOI 10.1145/347090.347114
[2]   A tree projection algorithm for generation of frequent item sets [J].
Agarwal, RC ;
Aggarwal, CC ;
Prasad, VVV .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2001, 61 (03) :350-371
[3]   An efficient genetic algorithm for automated mining of both positive and negative quantitative association rules [J].
Alatas, B ;
Akin, E .
SOFT COMPUTING, 2006, 10 (03) :230-237
[4]  
[Anonymous], 1992, GENETIC ALGORITHMS D, DOI 10.1007/978-3-662-03315-9
[5]  
Bayardo R. J. Jr., 1998, SIGMOD Record, V27, P85, DOI 10.1145/276305.276313
[6]  
BEASLEY D, 1993, U COMPUT, V15, P58
[7]  
Burdick D., P 17 INT C DAT ENG, P443
[8]  
Goldberg D. E., 1989, GENETIC ALGORITHMS S, V1989, P36
[9]   GenMax: An efficient algorithm for mining maximal frequent itemsets [J].
Gouda, K ;
Zaki, MJ .
DATA MINING AND KNOWLEDGE DISCOVERY, 2005, 11 (03) :223-242
[10]  
Hipp Jochen, 2000, ACM SIGKDD Explorations, V2, P58, DOI 10.1145/360402.360421