A framework for feature selection through boosting

Cited by: 78
Authors
Alsahaf, Ahmad [1 ]
Petkov, Nicolai [1 ]
Shenoy, Vikram [2 ]
Azzopardi, George [1 ]
Affiliations
[1] Univ Groningen, Bernoulli Inst Math Comp Sci & Artificial Intelli, POB 407, NL-9700 AK Groningen, Netherlands
[2] Northeastern Univ, Khoury Coll Comp Sci, West Village Residence Complex H, Boston, MA 02115 USA
Keywords
Feature selection; Boosting; Ensemble learning; XGBoost; MUTUAL INFORMATION; OPTIMIZATION;
DOI
10.1016/j.eswa.2021.115895
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the dimensionality of datasets used in predictive modelling continues to grow, feature selection becomes increasingly important. Datasets with complex feature interactions and high levels of redundancy still present a challenge to existing feature selection methods. We propose a novel feature selection framework that relies on boosting, i.e. sample re-weighting, to select sets of informative features in classification problems. The method builds on the feature rankings derived from fast and scalable tree-boosting models, such as XGBoost. We compare the proposed method with standard feature selection algorithms on 9 benchmark datasets, and show that the proposed approach reaches higher accuracies with fewer features on most of the tested datasets, and that the selected features have lower redundancy.
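The abstract outlines an iterative scheme: fit a tree-boosting model, keep its top-ranked feature, then re-weight the samples before the next round so later rounds focus on harder examples. The sketch below illustrates one plausible form of that loop; it uses scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost, and the specific re-weighting rule, function name, and parameter values are assumptions for illustration, not the paper's actual algorithm.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

def boosted_feature_selection(X, y, n_select=5):
    """Illustrative sketch (not the paper's method): repeatedly fit a
    tree-boosting model, pick its highest-ranked unselected feature,
    and up-weight misclassified samples for the next round."""
    n_samples, n_features = X.shape
    weights = np.full(n_samples, 1.0 / n_samples)
    selected = []
    while len(selected) < n_select:
        model = GradientBoostingClassifier(n_estimators=50, random_state=0)
        model.fit(X, y, sample_weight=weights)
        # Rank features by importance; skip those already selected.
        ranking = np.argsort(model.feature_importances_)[::-1]
        best = next(f for f in ranking if f not in selected)
        selected.append(best)
        # Boosting-style re-weighting: emphasise misclassified samples.
        misclassified = model.predict(X) != y
        weights[misclassified] *= 2.0
        weights /= weights.sum()
    return selected

# Toy data: 20 features, of which 5 are informative.
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=42)
chosen = boosted_feature_selection(X, y, n_select=5)
print([int(f) for f in chosen])
```

The re-weighting step is what distinguishes this scheme from simply taking the top-k features of a single fitted model: each round's ranking is computed on a different sample distribution, which can surface features that complement those already chosen.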
Pages: 10
Related papers
50 records in total
  • [21] A Feature Selection Framework Based on Supervised Data Clustering
    Liu, Hongzhi
    Fu, Bin
    Jiang, Zhengshen
    Wu, Zhonghai
    Hsu, D. Frank
    2016 IEEE 15TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC), 2016, : 316 - 321
  • [22] Feature Selection and Implementation of IDS using Boosting algorithm
    Shrivastava, Utpal
    Sharma, Neelam
    2020 INTERNATIONAL CONFERENCE ON COMPUTATIONAL PERFORMANCE EVALUATION (COMPE-2020), 2020, : 853 - 858
  • [23] Extensions to Online Feature Selection Using Bagging and Boosting
    Ditzler, Gregory
    LaBarck, Joseph
    Ritchie, James
    Rosen, Gail
    Polikar, Robi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (09) : 4504 - 4509
  • [24] A Multiform Optimization Framework for Multiobjective Feature Selection in Classification
    Liang, Jing
    Zhang, Yuyang
    Qu, Boyang
    Chen, Ke
    Yu, Kunjie
    Yue, Caitong
    IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2024, 28 (04) : 1024 - 1038
  • [25] An adaptive boosting algorithm based on weighted feature selection and category classification confidence
    Wang, Youwei
    Feng, Lizhou
    APPLIED INTELLIGENCE, 2021, 51 (10) : 6837 - 6858
  • [27] An integrated feature ranking and selection framework for ADHD characterization
    Xiao, C.
    Bledsoe, J.
    Wang, S.
    Chaovalitwongse, W. A.
    Mehta, S.
    Semrud-Clikeman, M.
    Grabowski, T.
    BRAIN INFORMATICS, 2016, 3 (3) : 145 - 155
  • [28] Contrasting Undersampled Boosting with Internal and External Feature Selection for Patient Response Datasets
    Khoshgoftaar, Taghi M.
    Dittman, David J.
    Wald, Randall
    Napolitano, Amri
    2013 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2013), VOL 2, 2013, : 404 - 410
  • [29] Boosting Color Feature Selection for Color Face Recognition
    Choi, Jae Young
    Ro, Yong Man
    Plataniotis, Konstantinos N.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2011, 20 (05) : 1425 - 1434
  • [30] Feature selection based on artificial bee colony and gradient boosting decision tree
    Rao, Haidi
    Shi, Xianzhang
    Rodrigue, Ahoussou Kouassi
    Feng, Juanjuan
    Xia, Yingchun
    Elhoseny, Mohamed
    Yuan, Xiaohui
    Gu, Lichuan
    APPLIED SOFT COMPUTING, 2019, 74 : 634 - 642