Artificial Intelligence based wrapper for high dimensional feature selection

Cited by: 6
Authors
Jain, Rahi [1]
Xu, Wei [2]
Affiliations
[1] Princess Margaret Canc Res Ctr, Biostat Dept, Toronto, ON, Canada
[2] Univ Toronto, Dalla Lana Sch Publ Hlth, Toronto, ON, Canada
Keywords
High dimensional data; Wrapper feature selection; Artificial intelligence; AIWrap; Machine learning; Interaction terms; REDUCTION; REGRESSION; LASSO; SMOKE;
DOI
10.1186/s12859-023-05502-x
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Feature selection is important in high-dimensional data analysis. The wrapper approach is one way to perform feature selection, but it is computationally intensive because it builds and evaluates a model for each of many feature subsets. Existing wrapper algorithms focus primarily on shortening the path to an optimal feature set. However, they underutilize the capability of the feature subset models, which affects both feature selection and predictive performance.

Method and Results: This study proposes a novel Artificial Intelligence based Wrapper (AIWrap) algorithm that integrates Artificial Intelligence (AI) with the existing wrapper algorithm. The algorithm uses AI to develop a Performance Prediction Model that predicts the model performance of any feature set, which allows the wrapper algorithm to evaluate a feature subset without building the corresponding model. This can make the wrapper approach more practical for high-dimensional data. The algorithm is evaluated using simulated studies and real research studies. AIWrap shows better or comparable feature selection and model prediction performance relative to standard penalized feature selection algorithms and wrapper algorithms.

Conclusion: AIWrap provides an alternative to existing feature selection algorithms. The current study focuses on applying AIWrap to continuous cross-sectional data. However, it could also be applied to other kinds of data, such as longitudinal, categorical and time-to-event biological data.
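The idea described in the abstract, predicting the performance of a feature subset instead of building a model for it, can be illustrated with a minimal sketch. This is not the authors' AIWrap implementation: the function names (`make_demo_data`, `subset_mse`, `aiwrap_like_selection`), the random sampling of subsets, the use of ordinary least squares for both the feature subset models and the performance prediction model, and all parameter values are assumptions chosen only to keep the example short and dependent on NumPy alone.

```python
import numpy as np

def make_demo_data(n=200, p=8, seed=1):
    """Hypothetical toy data: only features 0, 1 and 2 carry signal."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    y = 3 * X[:, 0] - 2 * X[:, 1] + 1.5 * X[:, 2] + 0.5 * rng.standard_normal(n)
    return X, y

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def subset_mse(X_tr, y_tr, X_va, y_va, mask):
    """Actual performance: build a model on the subset, return validation MSE."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:  # empty subset: fall back to predicting the training mean
        return float(np.mean((y_va - y_tr.mean()) ** 2))
    coef = fit_ols(X_tr[:, cols], y_tr)
    return float(np.mean((y_va - predict_ols(coef, X_va[:, cols])) ** 2))

def aiwrap_like_selection(X, y, n_train_subsets=60, n_candidates=2000, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    idx = rng.permutation(n)
    tr, va = idx[: n // 2], idx[n // 2:]

    # 1. Build real models for a small sample of random feature subsets.
    masks = rng.random((n_train_subsets, p)) < 0.5
    perf = np.array([subset_mse(X[tr], y[tr], X[va], y[va], m) for m in masks])

    # 2. Performance prediction model: maps a 0/1 inclusion mask to MSE.
    #    (A linear surrogate is an assumption made for brevity.)
    ppm = fit_ols(masks.astype(float), perf)

    # 3. Score many candidate subsets with the surrogate; no model is built here.
    candidates = rng.random((n_candidates, p)) < 0.5
    predicted = predict_ols(ppm, candidates.astype(float))

    # 4. Build a real model only for the subset with the best predicted MSE.
    best = candidates[np.argmin(predicted)]
    return best, subset_mse(X[tr], y[tr], X[va], y[va], best)
```

Calling `aiwrap_like_selection(*make_demo_data())` returns a boolean inclusion mask and the validation error of the model built on that subset. In this sketch only 61 subset models are actually fitted, while 2000 candidate subsets are scored by the surrogate; a stronger learner or a guided search could replace either component.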
Pages: 22
Related papers
50 in total
  • [21] Wrapper-Based Federated Feature Selection for IoT Environments
    Mahanipour, Afsaneh
    Khamfroush, Hana
    2023 INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKING AND COMMUNICATIONS, ICNC, 2023, : 214 - 219
  • [22] A Machine Learning-Based Wrapper Method for Feature Selection
    Patel, Damodar
    Saxena, Amit
    Wang, John
    INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2024, 20 (01)
  • [23] Quantum based Whale Optimization Algorithm for wrapper feature selection
    Agrawal, R. K.
    Kaur, Baljeet
    Sharma, Surbhi
    APPLIED SOFT COMPUTING, 2020, 89
  • [24] Review on Wrapper Feature Selection Approaches
    El Aboudi, Naoual
    Benhlima, Laila
    2016 INTERNATIONAL CONFERENCE ON ENGINEERING & MIS (ICEMIS), 2016
  • [25] A Weighted Wrapper Approach to Feature Selection
    Kusy, Maciej
    Zajdel, Roman
    INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, 2021, 31 (04) : 685 - 696
  • [26] Fast wrapper feature subset selection in high-dimensional datasets by means of filter re-ranking
    Bermejo, Pablo
    de la Ossa, Luis
    Gamez, Jose A.
    Puerta, Jose M.
    KNOWLEDGE-BASED SYSTEMS, 2012, 25 (01) : 35 - 44
  • [27] Two hybrid wrapper-filter feature selection algorithms applied to high-dimensional microarray experiments
    Apolloni, Javier
    Leguizamon, Guillermo
    Alba, Enrique
    APPLIED SOFT COMPUTING, 2016, 38 : 922 - 932
  • [28] A GRASP algorithm for fast hybrid (filter-wrapper) feature subset selection in high-dimensional datasets
    Bermejo, Pablo
    Gamez, Jose A.
    Puerta, Jose M.
    PATTERN RECOGNITION LETTERS, 2011, 32 (05) : 701 - 711
  • [29] Filter based Backward Elimination in Wrapper based PSO for Feature Selection in Classification
    Hoai Bach Nguyen
    Xue, Bing
    Liu, Ivy
    Zhang, Mengjie
    2014 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2014, : 3111 - 3118
  • [30] A wrapper feature selection method for combined tree-based classifiers
    Gatnar, E
    FROM DATA AND INFORMATION ANALYSIS TO KNOWLEDGE ENGINEERING, 2006, : 119 - 125