Maximal cliques-based hybrid high-dimensional feature selection with interaction screening for regression

Cited by: 1
Authors
Chamlal, Hasna [1 ]
Benzmane, Asmaa [1 ]
Ouaderhman, Tayeb [1 ]
Affiliations
[1] Hassan II Univ Casablanca, Fac Sci Ain Chock, Dept Math & Informat, Casablanca, Morocco
Keywords
Feature selection; Maximal clique; Rank correlation; High-dimensional data; Regression; GENERALIZED LINEAR-MODELS; GENETIC ALGORITHM; KOLMOGOROV FILTER; MARKOV BLANKET; BREAST-CANCER; EXPRESSION; REGULARIZATION; SEARCH;
DOI
10.1016/j.neucom.2024.128361
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Feature selection has been studied extensively in the literature, as it plays a significant role in both supervised and unsupervised machine learning tasks. Since the bulk of the features in high-dimensional data sets may be irrelevant, feature selection is key to removing unimportant variables and improving prediction and data-analysis performance. Many existing feature selection methods, however, become ineffective on contemporary datasets, whose number of features grows rapidly relative to the sample size. This paper introduces a novel supervised feature selection method for regression problems, called maximal Clique with Interaction Screening (ISClique). The ISClique algorithm proceeds in two steps. First, a filter approach selects relevant features from the initial feature space and examines the interactions between them, using an innovative coefficient based on Kendall's tau and partial Kendall's tau. Second, the maximal clique strategy is applied as a wrapper to the set selected in the previous step to construct candidate feature subsets, and the subset that minimizes prediction error is retained. The proposed method integrates the advantages of graph theory with feature screening. Moreover, because the criteria used in developing ISClique accommodate variable heterogeneity, the method is equally suitable for classification tasks. The proposed hybrid approach has been evaluated on various simulation scenarios and real datasets; experimental findings demonstrate the advantages of ISClique over comparable methods.
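The following is a minimal illustrative sketch of the two-step filter-plus-wrapper structure described in the abstract, not the authors' implementation: plain Kendall's tau stands in for the paper's coefficient (which combines Kendall's tau and partial Kendall's tau), and the graph-construction rule (thresholded pairwise |tau| among retained features) is an assumption made purely for illustration.

```python
# Sketch of a filter + maximal-clique wrapper pipeline, under the assumptions above.
# It is NOT the ISClique algorithm from the paper, only an outline of its structure.
import numpy as np
import networkx as nx
from scipy.stats import kendalltau
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score


def isclique_sketch(X, y, keep=20, edge_threshold=0.3, cv=5):
    n, p = X.shape

    # Step 1 (filter): screen features by |Kendall's tau| with the response
    # (placeholder for the paper's Kendall/partial-Kendall-based coefficient).
    tau_y = np.array([abs(kendalltau(X[:, j], y)[0]) for j in range(p)])
    retained = np.argsort(tau_y)[::-1][: min(keep, p)]

    # Step 2 (wrapper): build a graph on the retained features, connecting pairs
    # whose mutual association exceeds a threshold, then enumerate maximal cliques.
    G = nx.Graph()
    G.add_nodes_from(retained)
    for idx, i in enumerate(retained):
        for j in retained[idx + 1:]:
            if abs(kendalltau(X[:, i], X[:, j])[0]) > edge_threshold:
                G.add_edge(i, j)

    # Treat each maximal clique as a candidate subset and keep the one with the
    # smallest cross-validated prediction error (mean squared error here).
    best_subset, best_error = None, np.inf
    for clique in nx.find_cliques(G):
        subset = list(clique)
        scores = cross_val_score(LinearRegression(), X[:, subset], y,
                                 scoring="neg_mean_squared_error", cv=cv)
        error = -scores.mean()
        if error < best_error:
            best_subset, best_error = subset, error
    return best_subset, best_error


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 50))
    y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=100)
    subset, err = isclique_sketch(X, y)
    print("selected features:", subset, "cv error:", round(err, 3))
```

The sketch only conveys the division of labor: a cheap correlation-based screen shrinks the feature space, and the combinatorial clique search (the expensive wrapper part) runs only on the retained features.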
Pages: 22