Feature selection techniques for machine learning: a survey of more than two decades of research

Authors
Dipti Theng
Kishor K. Bhoyar
Affiliations
[1] Department of Information Technology, YCCE
Source
Knowledge and Information Systems | 2024, Vol. 66
Keywords
Feature selection; Machine learning; High-dimensional data; Filter techniques; Wrapper techniques; Embedded techniques;
Abstract
Learning algorithms can be less effective on datasets with an extensive feature space due to the presence of irrelevant and redundant features. Feature selection is a technique that reduces the dimensionality of the feature space by eliminating irrelevant and redundant features without significantly degrading the decision quality of the trained model. Over the last few decades, numerous algorithms have been developed to identify the most significant features for specific learning tasks. Each algorithm has its advantages and disadvantages, and it is the responsibility of the data scientist to determine whether a specific algorithm suits a particular task. With such a vast number of feature selection algorithms available, however, selecting the appropriate one can be a daunting task even for an expert. These challenges have motivated us to analyze the properties of the algorithms together with the characteristics of the datasets they are applied to. This paper reviews existing feature selection algorithms, providing an exhaustive analysis of their properties and relative performance, and addresses their evolution, formulation, and usefulness. It further categorizes the reviewed algorithms according to the properties required for a specific dataset and the objective under study, and it discusses popular area-specific feature selection techniques. Finally, it identifies and discusses open research challenges in feature selection that are yet to be overcome.
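The keywords above distinguish filter, wrapper, and embedded techniques. As a minimal illustration of the filter family only (an assumed example, not code from the surveyed work), the sketch below uses scikit-learn's SelectKBest with a mutual-information score to keep the most relevant features of a synthetic high-dimensional dataset before training a classifier; the dataset sizes and parameter values are arbitrary choices for demonstration.

    # Hedged sketch of filter-style feature selection (assumed example):
    # score each feature by mutual information with the label and keep
    # only the top-scoring ones before fitting a classifier.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    # Synthetic high-dimensional data: 100 features, only 10 informative
    # (the rest are redundant or pure noise).
    X, y = make_classification(n_samples=500, n_features=100, n_informative=10,
                               n_redundant=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Filter step: keep the 10 features with the highest mutual information,
    # then train a classifier on the reduced feature space.
    model = make_pipeline(
        SelectKBest(mutual_info_classif, k=10),
        LogisticRegression(max_iter=1000),
    )
    model.fit(X_train, y_train)
    print("Accuracy on reduced feature space:", model.score(X_test, y_test))

Wrapper and embedded techniques, by contrast, evaluate feature subsets through the learner itself (for example, recursive feature elimination) or perform selection during training (for example, L1-regularized models); the survey compares these families in detail.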
Pages: 1575–1637 (62 pages)