Combining Apriori heuristic and bio-inspired algorithms for solving the frequent itemsets mining problem

被引:100
作者
Djenouri, Youcef [1 ]
Comuzzi, Marco [1 ]
机构
[1] UNIST, 50 UNIST Gil, Ulsan 44919, South Korea
关键词
Frequent itemsets mining; Apriori heuristic; Genetic algorithm; Particle swarm optimization; HIGH-UTILITY ITEMSETS; ANT COLONY SYSTEM; ASSOCIATION RULES; GENETIC ALGORITHM; DATABASE; MINE;
D O I
10.1016/j.ins.2017.08.043
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Exact approaches to Frequent Itemsets Mining (FIM) are characterised by poor runtime performance when dealing with large database instances. Several FIM bio-inspired approaches have been proposed to overcome this issue. These are considerably more efficient from the point of view of runtime performance, but they still yield poor quality solutions. The quality of the solution, i.e., the number of frequent itemsets discovered, can be increased by improving the randomised search of the solutions space considering intrinsic features of the FIM problem. This paper proposes a new framework for FIM bio-inspired approaches that considers the recursive property of frequent itemsets, i.e., the same feature exploited by the Apriori exact heuristic, in the search of the solution space. We define two new approaches to FIM, namely GA-Apriori and PSO-Apriori, based on the proposed framework, which use genetic algorithms and particle swarm optimisation, respectively. Extensive experiments on synthetic and real database instances show that the proposed approaches outperform other bio-inspired ones in terms of runtime performance. The results also reveal that the performance of PSO-Apriori is comparable to the one of exact approaches Apriori and FPGrowth in respect of the quality of solutions found. We also show that PSO-Apriori outperforms the recently developed BATFIM algorithm when dealing with very large database instances. (C) 2017 Elsevier Inc. All rights reserved.
引用
收藏
页码:1 / 15
页数:15
相关论文
共 34 条
[1]  
Aalst W. V. D., 2015, ACM T MANAGEMENT INF, V5, p18e
[2]  
Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
[3]   An efficient genetic algorithm for automated mining of both positive and negative quantitative association rules [J].
Alatas, B ;
Akin, E .
SOFT COMPUTING, 2006, 10 (03) :230-237
[4]   Data-intensive applications, challenges, techniques and technologies: A survey on Big Data [J].
Chen, C. L. Philip ;
Zhang, Chun-Yang .
INFORMATION SCIENCES, 2014, 275 :314-347
[5]   On the discovery of association rules by means of evolutionary algorithms [J].
del Jesus, Maria J. ;
Gamez, Jose A. ;
Gonzalez, Pedro ;
Puerta, Jose M. .
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2011, 1 (05) :397-415
[6]  
Djenouri Y., 2014, LNCS
[7]   GPU-based bees swarm optimization for association rules mining [J].
Djenouri, Youcef ;
Bendjoudi, Ahcene ;
Mehdi, Malika ;
Nouali-Taboudjemat, Nadia ;
Habbas, Zineb .
JOURNAL OF SUPERCOMPUTING, 2015, 71 (04) :1318-1344
[8]   Bees swarm optimisation using multiple strategies for association rule mining [J].
Djenouri, Youcef ;
Drias, Habiba ;
Habbas, Zineb .
INTERNATIONAL JOURNAL OF BIO-INSPIRED COMPUTATION, 2014, 6 (04) :239-249
[9]   Accelerated PSO Swarm Search Feature Selection for Data Stream Mining Big Data [J].
Fong, Simon ;
Wong, Raymond ;
Vasilakos, Athanasios V. .
IEEE TRANSACTIONS ON SERVICES COMPUTING, 2016, 9 (01) :33-45
[10]  
Gheraibia Y., 2016, J COMPUT INF TECHNOL, V24, P165