Stability of Feature Selection in Multi-Omics Data Analysis

Authors
Lukaszuk, Tomasz [1]
Krawczuk, Jerzy [1]
Zyla, Kamil [2]
Kesik, Jacek [2]
Affiliations
[1] Bialystok Tech Univ, Fac Comp Sci, Wiejska 45A, PL-15351 Bialystok, Poland
[2] Lublin Univ Technol, Fac Elect Engn & Comp Sci, Dept Comp Sci, Nadbystrzycka 36B, PL-20618 Lublin, Poland
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 23
Keywords
multi-omics; high-dimensional data; cancer genomics; feature selection; stability; L1 regularization; CLASSIFICATION; ALGORITHMS
DOI
10.3390/app142311103
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In the rapidly evolving field of multi-omics data analysis, understanding the stability of feature selection is critical for reliable biomarker discovery and clinical applications. This study investigates the stability of feature-selection methods across various cancer types using 15 datasets from The Cancer Genome Atlas (TCGA). We employed classifiers with embedded feature selection, including Support Vector Machines (SVM), Logistic Regression (LR), and Lasso regression, each incorporating L1 regularization. Using five-fold cross-validation, we measured feature-selection stability and assessed the accuracy of predicting TP53 mutations, a known indicator of poor clinical outcomes in cancer patients. All three classifiers achieved their highest feature-selection stability, as measured by the Nogueira metric, under stronger regularization (fewer selected features), while weaker regularization generally reduced stability across all omics layers. Feature stability also differed across the omics layers: miRNA consistently exhibited the highest stability across classifiers, while the mutation and RNA layers were generally less stable, particularly under weaker regularization. This work highlights the importance of careful feature selection and validation in high-dimensional datasets to enhance the robustness and reliability of multi-omics analyses.
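The abstract reports stability with the Nogueira metric computed over cross-validation folds. As a minimal sketch, assuming the commonly cited definition of that measure (one minus the mean unbiased per-feature selection variance, normalized by the variance expected for a random selection of the same average size), it could be computed as follows. The function name `nogueira_stability` and the input layout (one binary mask per fold) are illustrative assumptions, not the authors' code.

```python
from typing import Sequence


def nogueira_stability(selections: Sequence[Sequence[int]]) -> float:
    """Stability of M feature selections over d features.

    `selections` holds one row per run (e.g. per cross-validation fold);
    each row is a binary mask of length d, with 1 meaning "feature selected".
    Hypothetical helper written for illustration only.
    """
    m = len(selections)
    if m < 2:
        raise ValueError("at least two selection runs are required")
    d = len(selections[0])
    if d == 0 or any(len(row) != d for row in selections):
        raise ValueError("all runs must be non-empty masks of equal length")

    # Selection frequency of each feature across the runs.
    freqs = [sum(row[f] for row in selections) / m for f in range(d)]

    # Average number of selected features per run.
    k_bar = sum(freqs)
    denom = (k_bar / d) * (1.0 - k_bar / d)
    if denom == 0.0:
        raise ValueError("undefined when every run selects all or no features")

    # Unbiased sample variance of each feature's selection indicator.
    variances = [m / (m - 1) * p * (1.0 - p) for p in freqs]
    return 1.0 - (sum(variances) / d) / denom


# Five folds that always pick the same two of four features: fully stable.
identical_folds = [[1, 1, 0, 0]] * 5
# Two folds that pick disjoint feature sets: the least stable case for M = 2.
disjoint_folds = [[1, 1, 0, 0], [0, 0, 1, 1]]
print(nogueira_stability(identical_folds))
print(nogueira_stability(disjoint_folds))
```

In a setting like the one described, each row would be the mask of nonzero coefficients from an L1-regularized classifier fitted on one training fold, so stronger regularization yields sparser masks.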
Pages: 16