Multi-Objective Optimization for High-Dimensional Maximal Frequent Itemset Mining

Cited by: 5

Authors:

Zhang, Yalong [1]

Yu, Wei [1]

Ma, Xuan [2]

Ogura, Hisakazu [3]

Ye, Dongfen [1]

Affiliations:

[1] Quzhou Univ, Coll Elect & Informat Engn, Quzhou, Peoples R China

[2] Xian Univ Technol, Fac Automat & Informat Engn, Xian 710048, Peoples R China

[3] Univ Fukui, Grad Sch Engn, Fukui 9108507, Japan

Source:

APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 19

Keywords:

association rules; frequent itemset mining; big data; multi-objective optimization; maximal frequent itemset; ALGORITHM;

DOI:

10.3390/app11198971

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Because big data is high-dimensional, the solution space of frequent itemsets generally grows exponentially. Yet association rule analysis of big data depends on mining frequent itemsets from high-dimensional transaction sets. Classical algorithms such as Apriori and FP-Growth, along with their derivatives, consume too much storage space and running time in such an explosive solution space to be acceptable for practical big data analysis. A multi-objective optimization algorithm is proposed to mine the frequent itemsets of high-dimensional data. First, all frequent 2-itemsets are generated by scanning the transaction set; these serve as the objects of population evolution, to which new items are added. The algorithm searches for maximal frequent itemsets in order to cover more non-empty subsets, because every non-empty subset of a frequent itemset is itself frequent. While the algorithm runs, lethal gene fragments in individuals are recorded and eliminated so that the individuals can be revived. Finally, the Pareto optimal solution set of frequent itemsets is obtained: every non-empty subset of these solutions is a frequent itemset, and every proper superset is infrequent. Experiments demonstrate the practicability and validity of the proposed algorithm on big data.
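
The two properties the abstract relies on (all subsets of a frequent itemset are frequent, and a maximal frequent itemset has no frequent proper superset) can be illustrated with a small sketch. This is not the paper's multi-objective evolutionary algorithm: it assumes a plain greedy extension in place of population evolution, it may miss some maximal itemsets, and the names `support_count`, `frequent_pairs`, `grow_to_maximal`, and `min_count` are hypothetical.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions)


def frequent_pairs(transactions, min_count):
    """Keep every 2-itemset whose support count reaches `min_count`."""
    items = sorted(set().union(*transactions))
    return [
        frozenset(pair)
        for pair in combinations(items, 2)
        if support_count(frozenset(pair), transactions) >= min_count
    ]


def grow_to_maximal(seed, transactions, min_count):
    """Greedily add items while the itemset stays frequent.

    Assumption: this greedy loop stands in for the evolutionary search.
    It stops only when no single item can be added, so by the subset
    property no proper superset of the result is frequent.
    """
    items = sorted(set().union(*transactions))
    current = frozenset(seed)
    extended = True
    while extended:
        extended = False
        for item in items:
            if item in current:
                continue
            candidate = current | {item}
            if support_count(candidate, transactions) >= min_count:
                current = candidate
                extended = True
    return current


def some_maximal_frequent_itemsets(transactions, min_count):
    """Grow each frequent 2-itemset into one maximal frequent itemset."""
    return {
        grow_to_maximal(pair, transactions, min_count)
        for pair in frequent_pairs(transactions, min_count)
    }


transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("abd"),
    frozenset("bd"),
    frozenset("ac"),
]
result = some_maximal_frequent_itemsets(transactions, min_count=2)
```

On this toy data, every itemset in `result` is frequent and cannot be extended by any single item without dropping below `min_count`.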


Pages: 15

