Feature Selection in High-Dimensional Space with Applications to Gene Expression Data

被引:0
|
作者
Pantha, Nishan [1 ]
Ramasubramanian, Muthukumaran [1 ]
Gurung, Iksha [1 ]
Maskey, Manil [2 ]
Sanders, Lauren M. [2 ]
Casaletto, James [2 ]
Costes, Sylvain V. [2 ]
机构
[1] Univ Alabama, Huntsville, AL 35899 USA
[2] Natl Aeronaut & Space Adm, Washington, DC USA
来源
SOUTHEASTCON 2024 | 2024年
关键词
REGRESSION;
D O I
10.1109/SOUTHEASTCON52093.2024.10500057
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Recent years have seen rapid growth in high-dimensional datasets. Most existing machine learning (ML) algorithms fail in high-dimensional settings where many features could be redundant. A critical process of feature selection is thus applied in such a setting that helps in identifying the most relevant features while removing redundant ones. With the increase in high dimensionality, one is also faced with problems of efficiency and interpretation in performing such selection methods. Therefore, this paper proposes a "novel" feature selection framework that uses an ensemble of interpretable ML algorithms to perform feature selection and the ranking of final features. Finally, this framework is applied to a gene expression dataset obtained through collaboration with the National Aeronautics and Space Administration (NASA)'s Biological and Physical Sciences (BPS) team and helps identify important and relevant genes contributing to specific target attributes through classification tasks.
引用
收藏
页码:6 / 15
页数:10
相关论文
共 50 条
  • [1] Improved aquila optimizer with mRMR for feature selection of high-dimensional gene expression data
    Qin, Xiwen
    Zhang, Siqi
    Dong, Xiaogang
    Shi, Hongyu
    Yuan, Liping
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (09): : 13005 - 13027
  • [2] Benchmark of filter methods for feature selection in high-dimensional gene expression survival data
    Bommert, Andrea
    Welchowski, Thomas
    Schmid, Matthias
    Rahnenfuehrer, Joerg
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (01)
  • [3] Feature selection for high-dimensional data
    Destrero A.
    Mosci S.
    De Mol C.
    Verri A.
    Odone F.
    Computational Management Science, 2009, 6 (1) : 25 - 40
  • [4] Feature selection for high-dimensional data
    Bolón-Canedo V.
    Sánchez-Maroño N.
    Alonso-Betanzos A.
    Progress in Artificial Intelligence, 2016, 5 (2) : 65 - 75
  • [5] The feature selection bias problem in relation to high-dimensional gene data
    Krawczuk, Jerzy
    Lukaszuk, Tomasz
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2016, 66 : 63 - 71
  • [6] FEATURE SELECTION FOR HIGH-DIMENSIONAL DATA ANALYSIS
    Verleysen, Michel
    NCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON NEURAL COMPUTATION THEORY AND APPLICATIONS, 2011, : IS23 - IS25
  • [7] Feature selection for high-dimensional imbalanced data
    Yin, Liuzhi
    Ge, Yong
    Xiao, Keli
    Wang, Xuehua
    Quan, Xiaojun
    NEUROCOMPUTING, 2013, 105 : 3 - 11
  • [8] Feature selection for high-dimensional data in astronomy
    Zheng, Hongwen
    Zhang, Yanxia
    ADVANCES IN SPACE RESEARCH, 2008, 41 (12) : 1960 - 1964
  • [9] A filter feature selection for high-dimensional data
    Janane, Fatima Zahra
    Ouaderhman, Tayeb
    Chamlal, Hasna
    JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2023, 17
  • [10] Feature selection for high-dimensional temporal data
    Michail Tsagris
    Vincenzo Lagani
    Ioannis Tsamardinos
    BMC Bioinformatics, 19