A framework for feature selection through boosting

Cited by: 78
Authors
Alsahaf, Ahmad [1]
Petkov, Nicolai [1]
Shenoy, Vikram [2]
Azzopardi, George [1]
Affiliations
[1] Univ Groningen, Bernoulli Inst Math Comp Sci & Artificial Intelli, POB 407, NL-9700 AK Groningen, Netherlands
[2] Northeastern Univ, Khoury Coll Comp Sci, West Village Residence Complex H, Boston, MA 02115 USA
Keywords
Feature selection; Boosting; Ensemble learning; XGBoost; Mutual information; Optimization
DOI
10.1016/j.eswa.2021.115895
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As dimensions of datasets in predictive modelling continue to grow, feature selection becomes increasingly practical. Datasets with complex feature interactions and high levels of redundancy still present a challenge to existing feature selection methods. We propose a novel framework for feature selection that relies on boosting, or sample re-weighting, to select sets of informative features in classification problems. The method uses as its basis the feature rankings derived from fast and scalable tree-boosting models, such as XGBoost. We compare the proposed method to standard feature selection algorithms on 9 benchmark datasets. We show that the proposed approach reaches higher accuracies with fewer features on most of the tested datasets, and that the selected features have lower redundancy.
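The idea of selecting features through sample re-weighting can be illustrated with a small sketch. The code below is not the authors' algorithm and does not use XGBoost: it assumes an AdaBoost-style weight update and uses one-feature decision stumps in place of tree-boosting rankings. All function names and the toy data are hypothetical and chosen only for illustration.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Predict +1/-1 from a single feature with a threshold rule."""
    return polarity if row[feature] > threshold else -polarity


def best_stump(X, y, w, features):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = (float("inf"), None, None, 1)
    for j in features:
        values = sorted({row[j] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            for polarity in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, j, t, polarity) != yi
                )
                if err < best[0]:
                    best = (err, j, t, polarity)
    return best


def boosted_feature_selection(X, y, max_features):
    """Greedily pick features, re-weighting samples after each pick (assumed AdaBoost update)."""
    n = len(X)
    w = [1.0 / n] * n
    remaining = list(range(len(X[0])))
    selected = []
    while remaining and len(selected) < max_features:
        err, j, t, pol = best_stump(X, y, w, remaining)
        if j is None or err >= 0.5:
            break  # no remaining feature is better than chance on the current weights
        selected.append(j)
        remaining.remove(j)
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        # Misclassified samples gain weight, correctly classified samples lose weight.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, j, t, pol))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return selected


def naive_ranking(X, y, k):
    """Baseline: rank features by unweighted stump error, with no re-weighting."""
    n = len(X)
    w = [1.0 / n] * n
    errors = [best_stump(X, y, w, [j])[0] for j in range(len(X[0]))]
    return sorted(range(len(errors)), key=lambda j: errors[j])[:k]


# Toy data: column 1 duplicates column 0; column 2 is weaker alone
# but is correct exactly where column 0 is wrong.
X = [
    [1, 1, 1], [1, 1, 1], [1, 1, 0], [1, 1, 0], [0, 0, 1],
    [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 1], [1, 1, 0],
]
y = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1]

if __name__ == "__main__":
    print("re-weighted selection:", boosted_feature_selection(X, y, 2))
    print("naive ranking:", naive_ranking(X, y, 2))
```

On this toy data, the re-weighted selection passes over the duplicated column in favour of the complementary one, whereas the unweighted ranking keeps both copies. This only mirrors, in miniature, the abstract's point about redundancy; it says nothing about the reported benchmark results.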
Pages: 10
Related papers
50 in total
  • [1] A general framework for boosting feature subset selection algorithms
    Perez-Rodriguez, Javier
    de Haro-Garcia, Aida
    Romero del Castillo, Juan A.
    Garcia-Pedrajas, Nicolas
    INFORMATION FUSION, 2018, 44 : 147 - 175
  • [2] Evolutionary feature selection in boosting
    Matsui, K
    Sato, H
    2004 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN & CYBERNETICS, VOLS 1-7, 2004 : 4780 - 4785
  • [3] Framework for the Ensemble of Feature Selection Methods
    Mera-Gaona, Maritza
    Lopez, Diego M.
    Vargas-Canas, Rubiel
    Neumann, Ursula
    APPLIED SCIENCES-BASEL, 2021, 11 (17)
  • [4] An improved boosting based on feature selection for corporate bankruptcy prediction
    Wang, Gang
    Ma, Jian
    Yang, Shanlin
    EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (05) : 2353 - 2361
  • [5] Feature selection and multiple kernel boosting framework based on PSO with mutation mechanism for hyperspectral classification
    Qi, Chengming
    Zhou, Zhangbing
    Sun, Yunchuan
    Song, Houbing
    Hu, Lishuan
    Wang, Qun
    NEUROCOMPUTING, 2017, 220 : 181 - 190
  • [6] A hybrid framework for optimal feature subset selection
    Shukla, Alok Kumar
    Singh, Pradeep
    Vardhan, Manu
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 36 (03) : 2247 - 2259
  • [7] Robust twin boosting for feature selection from high-dimensional omics data with label noise
    He, Shan
    Chen, Huanhuan
    Zhu, Zexuan
    Ward, Douglas G.
    Cooper, Helen J.
    Viant, Mark R.
    Heath, John K.
    Yao, Xin
    INFORMATION SCIENCES, 2015, 291 : 1 - 18
  • [8] Feature Weighting and Selection Using Hypothesis Margin of Boosting
    Alshawabkeh, Malak
    Aslam, Javed A.
    Dy, Jennifer G.
    Kaeli, David
    12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2012), 2012 : 41 - 50
  • [9] Boosting feature selection using information metric for classification
    Liu, Huawen
    Liu, Lei
    Zhang, Huijie
    NEUROCOMPUTING, 2009, 73 (1-3) : 295 - 303
  • [10] Feature Selection in Click-Through Rate Prediction Based on Gradient Boosting
    Wang, Zheng
    Yu, Qingsong
    Shen, Chaomin
    Hu, Wenxin
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2016, 2016, 9937 : 134 - 142