Metaheuristics for data mining: survey and opportunities for big data

Cited by: 13
Authors
Dhaenens, Clarisse [1]
Jourdan, Laetitia [1]
Affiliations
[1] Univ Lille, Cent Lille, CNRS, UMR 9189 CRIStAL, F-59000 Lille, France
Keywords
Metaheuristics; Clustering; Association rules; Classification; Feature selection; Big data; PARTICLE SWARM OPTIMIZATION; FUZZY ASSOCIATION RULES; GENETIC ALGORITHM; EVOLUTIONARY ALGORITHMS; OPERATIONS-RESEARCH; FEATURE-SELECTION; DECISION TREES; EFFICIENT ALGORITHM; BAT ALGORITHM; SEARCH;
DOI
10.1007/s10479-021-04496-0
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
In the context of big data, many scientific communities aim to provide efficient approaches to accommodate large-scale datasets. This is the case of the machine-learning community, and more generally, the artificial intelligence community. The aim of this article is to explain how data mining problems can be considered as combinatorial optimization problems, and how metaheuristics can be used to address them. Four primary data mining tasks are presented: clustering, association rules, classification, and feature selection. This article follows the publication of a book in 2016 concerning this subject (Dhaenens and Jourdan in Metaheuristics for big data, Wiley, Hoboken, 2016), and an article published in 4OR (Dhaenens and Jourdan in 4OR 17 (2):115-139, 2019); additionally, updated references and an analysis of the current trends are presented.
Pages: 117-140
Page count: 24