Ensemble Feature Selection With Block-Regularized m x 2 Cross-Validation

Cited by: 2
Authors
Yang, Xingli [1 ]
Wang, Yu [2 ]
Wang, Ruibo [2 ]
Li, Jihong [2 ]
Institutions
[1] Shanxi Univ, Sch Math Sci, Taiyuan 030006, Peoples R China
[2] Shanxi Univ, Sch Modern Educ Technol, Taiyuan 030006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Correlation; Indexes; Data models; Technological innovation; Reliability theory; Upper bound; Beta distribution; block-regularized m x 2 cross-validation; ensemble feature selection (EFS); false positive; true positive; VARIABLE SELECTION; REGRESSION; PRECISION; RECALL;
DOI
10.1109/TNNLS.2021.3128173
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ensemble feature selection (EFS) has attracted significant interest in the literature for its potential to reduce the discovery rate of noise features and to stabilize feature selection results. Motivated by the superior performance of block-regularized m x 2 cross-validation in generalization-error estimation and algorithm comparison, a novel EFS technique based on block-regularized m x 2 cross-validation is proposed in this study. In contrast to traditional ensemble learning, where the selection frequency follows a binomial distribution, the distribution of feature selection frequency in the proposed technique is more accurately approximated by a beta distribution. Furthermore, theoretical analysis shows that the proposed technique yields a higher selection probability for important features, a lower risk of selecting noise features, more true positives, and fewer false positives. Finally, these conclusions are verified by experiments on simulated and real data.
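The core mechanism described in the abstract, aggregating feature selection frequencies across the 2m training halves of an m x 2 cross-validation, can be sketched as follows. This is an illustrative, minimal sketch only: it uses plain random 2-fold splits rather than the paper's block-regularized partitioning, and a simple absolute-correlation filter as a stand-in base selector; the function names and the frequency threshold are assumptions, not the authors' implementation.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / ((vx * vy) ** 0.5 + 1e-12)


def select_top_k(X, y, k):
    """Stand-in base selector: top-k features by |correlation| with y."""
    p = len(X[0])
    scores = [abs(correlation([row[j] for row in X], y)) for j in range(p)]
    return set(sorted(range(p), key=lambda j: -scores[j])[:k])


def ensemble_select(X, y, m=7, k=3, threshold=0.5, seed=0):
    """Run m repetitions of 2-fold CV; a feature is kept if its
    selection frequency over the 2m halves reaches `threshold`.

    NOTE: the paper's method constrains the overlap between the m
    partitions (block regularization); plain random shuffles are used
    here for brevity.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(m):
        idx = list(range(n))
        rng.shuffle(idx)
        half = n // 2
        for fold in (idx[:half], idx[half:]):
            Xf = [X[i] for i in fold]
            yf = [y[i] for i in fold]
            for j in select_top_k(Xf, yf, k):
                counts[j] += 1
    freq = [c / (2 * m) for c in counts]
    selected = [j for j, f in enumerate(freq) if f >= threshold]
    return selected, freq
```

On synthetic data where only the first two of ten features drive the response, the informative features are selected in nearly every half (frequency near 1), while noise features rarely win a top-k slot, which is the true-positive / false-positive separation the paper analyzes.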
Pages: 6628-6641
Page count: 14