Maximal cliques-based hybrid high-dimensional feature selection with interaction screening for regression

Cited by: 1
Authors
Chamlal, Hasna [1]
Benzmane, Asmaa [1]
Ouaderhman, Tayeb [1]
Affiliations
[1] Hassan II Univ Casablanca, Fac Sci Ain Chock, Dept Math & Informat, Casablanca, Morocco
Keywords
Feature selection; Maximal clique; Rank correlation; High-dimensional data; Regression; Generalized linear models; Genetic algorithm; Kolmogorov filter; Markov blanket; Breast cancer; Expression; Regularization; Search
DOI
10.1016/j.neucom.2024.128361
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection has been studied extensively because it plays a significant role in both supervised and unsupervised machine learning tasks. Since most features in high-dimensional data sets may be irrelevant, feature selection is key to removing unimportant variables and improving prediction and data analysis performance. However, many current feature selection methods become ineffective on contemporary datasets, in which the number of features keeps growing relative to the sample size. This paper introduces a novel supervised feature selection method for regression problems, called maximal Clique with Interaction Screening (ISClique). The ISClique algorithm proceeds in two steps. First, a filter approach selects relevant features from the initial feature space and examines the interactions between them, using an innovative coefficient based on Kendall's tau and partial Kendall's tau. Second, the maximal clique strategy is applied as a wrapper to the set selected in the first step to construct subsets of features, and the subset that minimizes prediction error is selected. The proposed method thus combines the advantages of graph theory with those of feature screening. Additionally, because the criteria used in developing ISClique accommodate variable heterogeneity, the method is equally suitable for classification tasks. The proposed hybrid approach has been evaluated on various simulation scenarios and real datasets. Experimental findings demonstrate the advantages of ISClique over comparable methods.
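The abstract names three building blocks: Kendall's tau, partial Kendall's tau, and maximal cliques. The sketch below illustrates each one in plain Python. It is not the ISClique algorithm: the abstract does not define the paper's coefficient or how its feature graph is built, so the functions use the textbook definitions (tau-a without tie correction, Kendall's standard partial-tau formula, and basic Bron-Kerbosch enumeration), and all function names and the toy graph are hypothetical.

```python
from itertools import combinations
from math import sqrt


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Assumes len(x) == len(y) >= 2; ties contribute zero (no tie correction).
    """
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


def partial_kendall_tau(x, y, z):
    """Standard partial Kendall's tau of x and y, controlling for z."""
    t_xy = kendall_tau(x, y)
    t_xz = kendall_tau(x, z)
    t_yz = kendall_tau(y, z)
    denom = sqrt((1 - t_xz ** 2) * (1 - t_yz ** 2))
    # Return 0.0 when z is perfectly rank-correlated with x or y (undefined case).
    return (t_xy - t_xz * t_yz) / denom if denom > 0 else 0.0


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting).

    adj maps each node to the set of its neighbours (undirected graph).
    """
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Toy usage (hypothetical data, not from the paper).
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
z = [2, 1, 4, 3]
print(round(partial_kendall_tau(x, y, z), 4))  # 0.7071

# Hypothetical feature graph: nodes are feature indices, edges are assumed.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(maximal_cliques(graph))  # [[0, 1, 2], [2, 3]]
```

In a wrapper setting, each enumerated clique would be a candidate feature subset to score with a regression model. How the paper scores and compares those subsets is not stated in the abstract, so that step is left out here.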
Pages: 22