A survey on swarm intelligence approaches to feature selection in data mining

Cited by: 270
Authors
Nguyen, Bach Hoai [1]
Xue, Bing [1]
Zhang, Mengjie [1]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Keywords
Feature selection; Swarm intelligence; Particle swarm optimization; Ant colony optimization; Classification; Artificial bee colony; Feature subset selection; Cuckoo search algorithm; Support vector machines; Mutual information; Genetic algorithm; Hybrid approach; Local search; Binary PSO
DOI
10.1016/j.swevo.2020.100663
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
One of the major problems in Big Data is the large number of features or dimensions, which causes "the curse of dimensionality" when applying machine learning, especially classification algorithms. Feature selection is an important technique that selects small, informative feature subsets to improve learning performance. It is not an easy task because of its large and complex search space. Recently, swarm intelligence techniques have gained much attention from the feature selection community because of their simplicity and potential global search ability. However, there has been no comprehensive survey on swarm intelligence for feature selection in classification, which is the most widely investigated area in feature selection. The few short surveys in this area still lack in-depth discussion of the state-of-the-art methods and of the strengths and limitations of existing methods, particularly in terms of the representation and search mechanisms, which are two key components in adapting swarm intelligence to feature selection problems. This paper presents a comprehensive survey of state-of-the-art work applying swarm intelligence to feature selection in classification, with a focus on the representation and search mechanisms. The aim is to present an overview of the different kinds of state-of-the-art approaches together with their advantages and disadvantages, encourage researchers to investigate more advanced methods, provide practitioners with guidance for choosing appropriate methods in real-world scenarios, and discuss potential limitations and issues for future research.
Pages: 16