Metaheuristics for data mining: survey and opportunities for big data

Cited by: 13
Authors
Dhaenens, Clarisse [1]
Jourdan, Laetitia [1]
Affiliations
[1] Univ Lille, Cent Lille, CNRS, UMR 9189 CRIStAL, F-59000 Lille, France
Keywords
Metaheuristics; Clustering; Association rules; Classification; Feature selection; Big data; PARTICLE SWARM OPTIMIZATION; FUZZY ASSOCIATION RULES; GENETIC ALGORITHM; EVOLUTIONARY ALGORITHMS; OPERATIONS-RESEARCH; FEATURE-SELECTION; DECISION TREES; EFFICIENT ALGORITHM; BAT ALGORITHM; SEARCH;
DOI
10.1007/s10479-021-04496-0
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
In the context of big data, many scientific communities aim to provide efficient approaches to accommodate large-scale datasets. This is the case of the machine-learning community, and more generally, the artificial intelligence community. The aim of this article is to explain how data mining problems can be considered as combinatorial optimization problems, and how metaheuristics can be used to address them. Four primary data mining tasks are presented: clustering, association rules, classification, and feature selection. This article follows the publication of a book in 2016 concerning this subject (Dhaenens and Jourdan in Metaheuristics for big data, Wiley, Hoboken, 2016), and an article published in 4OR (Dhaenens and Jourdan in 4OR 17 (2):115-139, 2019); additionally, updated references and an analysis of the current trends are presented.
Pages: 117-140
Number of pages: 24
Related papers
168 in total
  • [1] Optimizing Big Data in Bioinformatics with Swarm Algorithms
    Abdul-Rahman, Shuzlina
    Abu Bakar, Azuraliza
    Mohamed-Hussein, Zeti-Azura
    [J]. 2013 IEEE 16TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2013), 2013, : 1091 - 1095
  • [2] Abubaker A, 2015, PLOS ONE, V10, DOI [10.1371/journal.pone.0130995, 10.1371/journal.pone.0135641]
  • [3] Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
  • [4] Agrawal R., 1994, PROC 20 INT C VERY L, V1215, P487, DOI 10.5555/645920.672836
  • [5] Research on particle swarm optimization based clustering: A systematic review of literature and techniques
    Alam, Shafiq
    Dobbie, Gillian
    Koh, Yun Sing
    Riddle, Patricia
    Rehman, Saeed Ur
    [J]. SWARM AND EVOLUTIONARY COMPUTATION, 2014, 17 : 1 - 13
  • [6] Modenar: Multi-objective differential evolution algorithm for mining numeric association rules
    Alatas, Bilal
    Akin, Erhan
    Karci, Ali
    [J]. APPLIED SOFT COMPUTING, 2008, 8 (01) : 646 - 656
  • [7] Gene selection in cancer classification using PSO/SVM and GA/SVM hybrid algorithms
    Alba, Enrique
    Garcia-Nieto, Jose
    Jourdan, Laetitia
    Talbi, El-Ghazali
    [J]. 2007 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-10, PROCEEDINGS, 2007, : 284 - +
  • [8] Genetic Algorithm Based Parallel K-Means Data Clustering Algorithm Using MapReduce Programming Paradigm on Hadoop Environment (GAPKCA)
    Alshammari, Sayer
    Zolkepli, Maslina Binti
    Abdullah, Rusli Bin
    [J]. RECENT ADVANCES ON SOFT COMPUTING AND DATA MINING (SCDM 2020), 2020, 978 : 98 - 108
  • [9] Anand Rajul, 2009, Proceedings of the 2009 World Congress on Nature & Biologically Inspired Computing (NaBIC 2009), P385, DOI 10.1109/NABIC.2009.5393878
  • [10] [Anonymous], 2005, Data Mining: Concepts and Techniques