Multi-Objective Optimization for High-Dimensional Maximal Frequent Itemset Mining

Cited by: 5
Authors
Zhang, Yalong [1]
Yu, Wei [1]
Ma, Xuan [2]
Ogura, Hisakazu [3]
Ye, Dongfen [1]
Affiliations
[1] Quzhou Univ, Coll Elect & Informat Engn, Quzhou, Peoples R China
[2] Xian Univ Technol, Fac Automat & Informat Engn, Xian 710048, Peoples R China
[3] Univ Fukui, Grad Sch Engn, Fukui 9108507, Japan
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 19
Keywords
association rules; frequent itemset mining; big data; multi-objective optimization; maximal frequent itemset; ALGORITHM
DOI
10.3390/app11198971
Chinese Library Classification (CLC) number
O6 [Chemistry]
Subject classification code
0703
Abstract
The solution space of frequent itemsets generally grows exponentially because of the high-dimensional attributes of big data. Yet association rule analysis on big data presupposes that the frequent itemsets of high-dimensional transaction sets can be mined. Classical algorithms such as Apriori and FP-Growth, along with their derivatives, consume so much storage space and running time in such an explosive solution space that they are impractical for big data analysis. This paper proposes a multi-objective optimization algorithm for mining the frequent itemsets of high-dimensional data. First, all frequent 2-itemsets are generated by scanning the transaction sets; these serve as the objects of population evolution, to which new items are added. The algorithm searches for maximal frequent itemsets, because every non-empty subset of a frequent itemset is itself frequent, so each maximal frequent itemset accounts for many frequent itemsets at once. During the run, lethal gene fragments in individuals are recorded and eliminated so that the affected individuals can recover. Finally, the Pareto optimal solution set of frequent itemsets is obtained: all non-empty subsets of these solutions are frequent itemsets, and all of their proper supersets are infrequent. Experiments demonstrate the practicability and validity of the proposed algorithm on big data.
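The two properties the abstract relies on (every non-empty subset of a frequent itemset is frequent, and a maximal frequent itemset has no frequent proper superset) can be illustrated with a minimal sketch. This is a brute-force enumeration on toy data, not the evolutionary multi-objective algorithm of the paper; the function names, the `min_count` threshold, and the five sample transactions are assumptions made for illustration only.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions)


def frequent_itemsets(transactions, min_count):
    """Brute-force enumeration of all frequent itemsets (toy data only)."""
    items = sorted(set().union(*transactions))
    found = []
    for k in range(1, len(items) + 1):
        level = [
            frozenset(c)
            for c in combinations(items, k)
            if support_count(frozenset(c), transactions) >= min_count
        ]
        if not level:
            # Downward closure: if no k-itemset is frequent,
            # no larger itemset can be frequent either.
            break
        found.extend(level)
    return found


def maximal_frequent_itemsets(transactions, min_count):
    """Frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_count)
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical sample data: five transactions over the items a, b, c, d.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("abd"),
    frozenset("bd"),
    frozenset("cd"),
]

maximal = maximal_frequent_itemsets(transactions, min_count=2)
print(sorted(sorted(s) for s in maximal))
# → [['a', 'b', 'c'], ['b', 'd']]
```

Here the two maximal itemsets stand in for all nine frequent itemsets, which is why searching for maximal itemsets compresses the result. The enumeration itself is exponential in the number of items, which is exactly the cost that a heuristic search over high-dimensional data aims to avoid.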
Pages: 15