Maximal cliques-based hybrid high-dimensional feature selection with interaction screening for regression

Cited: 1
Authors
Chamlal, Hasna [1 ]
Benzmane, Asmaa [1 ]
Ouaderhman, Tayeb [1 ]
Affiliations
[1] Hassan II Univ Casablanca, Fac Sci Ain Chock, Dept Math & Informat, Casablanca, Morocco
Keywords
Feature selection; Maximal clique; Rank correlation; High-dimensional data; Regression; GENERALIZED LINEAR-MODELS; GENETIC ALGORITHM; KOLMOGOROV FILTER; MARKOV BLANKET; BREAST-CANCER; EXPRESSION; REGULARIZATION; SEARCH
DOI
10.1016/j.neucom.2024.128361
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection has been studied extensively in the literature, as it plays a significant role in both supervised and unsupervised machine learning tasks. Because the bulk of the features in a high-dimensional data set may be irrelevant, feature selection is key to removing unimportant variables and improving prediction and data-analysis performance. Many existing feature selection methods, however, become ineffective on contemporary datasets, where the number of features grows large relative to the sample size. This paper introduces a novel supervised feature selection method for regression problems, called maximal Clique with Interaction Screening (ISClique). The ISClique algorithm proceeds in two steps. First, a filter approach selects relevant features from the initial feature space and examines the interactions between them, using a novel coefficient based on Kendall's tau and partial Kendall's tau. Second, a maximal-clique strategy is applied as a wrapper to the selected set to construct candidate feature subsets, and the subset that minimizes prediction error is chosen. The proposed method integrates the advantages of graph theory with feature screening. Additionally, because the criteria employed in developing ISClique accommodate variable heterogeneity, the method is equally suitable for classification tasks. The proposed hybrid approach has been evaluated on various simulation scenarios and real datasets, and experimental findings demonstrate its advantages over comparable methods.
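The abstract describes a two-step pipeline: a rank-correlation filter followed by a maximal-clique wrapper. The sketch below illustrates only that overall shape in plain Python, under loudly stated assumptions: it uses ordinary Kendall's tau-a (the paper's coefficient also involves partial Kendall's tau and interaction terms, which are not reproduced here), an invented edge rule (two screened features are joined when their mutual |tau| is low, i.e. they are not redundant), and a hold-out 1-nearest-neighbour error as the wrapper criterion. None of these specific choices come from the paper; this is not the authors' implementation.

```python
import itertools

def kendall_tau(x, y):
    """Naive O(n^2) Kendall's tau-a between two equal-length sequences (no tie correction)."""
    n = len(x)
    concordant = discordant = 0
    for i, j in itertools.combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def bron_kerbosch(r, p, x, adj, out):
    """Classic Bron-Kerbosch enumeration of the maximal cliques of a graph
    given as adjacency sets; appends each maximal clique to `out`."""
    if not p and not x:
        out.append(r)
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p = p - {v}
        x = x | {v}

def one_nn_sse(cols, y, feats):
    """Hold-out squared error of a 1-NN regressor restricted to `feats`
    (even-indexed samples train, odd-indexed samples test)."""
    n = len(y)
    train = range(0, n, 2)
    sse = 0.0
    for t in range(1, n, 2):
        nearest = min(train, key=lambda tr: sum((cols[f][t] - cols[f][tr]) ** 2
                                                for f in feats))
        sse += (y[t] - y[nearest]) ** 2
    return sse

def isclique_sketch(cols, y, tau_min=0.3, redund_max=0.5):
    """Two-step sketch of the ISClique shape: screen features by |tau| with y,
    then pick the maximal clique with the lowest hold-out prediction error."""
    # Step 1 (filter): keep features whose relevance to y clears tau_min.
    screened = [f for f in range(len(cols))
                if abs(kendall_tau(cols[f], y)) >= tau_min]
    # Hypothetical edge rule: join two screened features when their mutual
    # association stays below redund_max, so a clique is a set of mutually
    # non-redundant relevant features.
    adj = {f: set() for f in screened}
    for f, g in itertools.combinations(screened, 2):
        if abs(kendall_tau(cols[f], cols[g])) < redund_max:
            adj[f].add(g)
            adj[g].add(f)
    # Step 2 (wrapper): enumerate maximal cliques and keep the subset
    # with the smallest hold-out prediction error.
    cliques = []
    bron_kerbosch(set(), set(screened), set(), adj, cliques)
    return min(cliques, key=lambda c: one_nn_sse(cols, y, c))

# Toy data: feature 0 determines y, feature 1 is a near copy of it,
# feature 2 is nearly independent of y and should be screened out.
y = list(range(12))
f0 = list(range(12))
f1 = [0, 1, 3, 2, 4, 5, 6, 7, 9, 8, 10, 11]
f2 = [6, 7, 8, 9, 10, 11, 0, 1, 2, 3, 4, 5]
best = isclique_sketch([f0, f1, f2], y)  # a subset of the relevant features {0, 1}
```

The sketch keeps only the filter-then-wrapper structure the abstract names; swapping in the paper's actual coefficient and regression model would change Step 1's screening score and Step 2's error criterion, not the overall flow.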
Pages: 22
Related Papers (50 records)
  • [21] Tournament screening cum EBIC for feature selection with high-dimensional feature spaces
    Chen Zehua
    Chen JiaHua
    SCIENCE IN CHINA SERIES A-MATHEMATICS, 2009, 52 (06): : 1327 - 1341
  • [22] Feature selection using symmetric uncertainty and hybrid optimization for high-dimensional data
    Sun, Lin
    Sun, Shujing
    Ding, Weiping
    Huang, Xinyue
    Fan, Peiyi
    Li, Kunyu
    Chen, Leqi
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (12) : 4339 - 4360
  • [24] A filter feature selection for high-dimensional data
    Janane, Fatima Zahra
    Ouaderhman, Tayeb
    Chamlal, Hasna
    JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2023, 17
  • [25] Bayesian feature selection in high-dimensional regression in presence of correlated noise
    Feldman, Guy
    Bhadra, Anindya
    Kirshner, Sergey
    STAT, 2014, 3 (01): : 258 - 272
  • [26] FEATURE SELECTION FOR HIGH-DIMENSIONAL DATA ANALYSIS
    Verleysen, Michel
    ECTA 2011/FCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON EVOLUTIONARY COMPUTATION THEORY AND APPLICATIONS AND INTERNATIONAL CONFERENCE ON FUZZY COMPUTATION THEORY AND APPLICATIONS, 2011,
  • [27] FEATURE SELECTION FOR HIGH-DIMENSIONAL DATA ANALYSIS
    Verleysen, Michel
    NCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON NEURAL COMPUTATION THEORY AND APPLICATIONS, 2011, : IS23 - IS25
  • [28] A new representation in genetic programming with hybrid feature ranking criterion for high-dimensional feature selection
    Li, Jiayi
    Zhang, Fan
    Ma, Jianbin
    COMPLEX & INTELLIGENT SYSTEMS, 2025, 11 (04)
  • [29] Enhanced NSGA-II-based feature selection method for high-dimensional classification
    Li, Min
    Ma, Huan
    Lv, Siyu
    Wang, Lei
    Deng, Shaobo
    INFORMATION SCIENCES, 2024, 663
  • [30] Minimax Sparse Logistic Regression for Very High-Dimensional Feature Selection
    Tan, Mingkui
    Tsang, Ivor W.
    Wang, Li
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2013, 24 (10) : 1609 - 1622