Stability of Feature Selection in Multi-Omics Data Analysis

Citations: 0
Authors
Lukaszuk, Tomasz [1]
Krawczuk, Jerzy [1]
Zyla, Kamil [2]
Kesik, Jacek [2]
Affiliations
[1] Bialystok Tech Univ, Fac Comp Sci, Wiejska 45A, PL-15351 Bialystok, Poland
[2] Lublin Univ Technol, Fac Elect Engn & Comp Sci, Dept Comp Sci, Nadbystrzycka 36B, PL-20618 Lublin, Poland
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, Issue 23
Keywords
multi-omics; high-dimensional data; cancer genomics; feature selection; stability; L1 regularization; CLASSIFICATION; ALGORITHMS
DOI
10.3390/app142311103
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
In the rapidly evolving field of multi-omics data analysis, understanding the stability of feature selection is critical for reliable biomarker discovery and clinical applications. This study investigates the stability of feature-selection methods across various cancer types by utilizing 15 datasets from The Cancer Genome Atlas (TCGA). We employed classifiers with embedded feature selection, including Support Vector Machines (SVM), Logistic Regression (LR), and Lasso regression, each incorporating L1 regularization. Through a comprehensive evaluation using five-fold cross-validation, we measured feature-selection stability and assessed the accuracy of predictions regarding TP53 mutations, a known indicator of poor clinical outcomes in cancer patients. All three classifiers demonstrated optimal feature-selection stability, measured by the Nogueira metric, with higher regularization (fewer selected features), while lower regularization generally resulted in decreased stability across all omics layers. Our findings indicate differences in feature stability across the various omics layers; the miRNA layer consistently exhibited the highest stability across classifiers, while the mutation and RNA layers were generally less stable, particularly with lower regularization. This work highlights the importance of careful feature selection and validation in high-dimensional datasets to enhance the robustness and reliability of multi-omics analyses.
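The abstract names the Nogueira metric as its stability measure but does not show how it is computed. The following is a minimal sketch, assuming the commonly stated form of the estimator from Nogueira et al.: one minus the mean unbiased per-feature selection variance, divided by the variance expected when the same average number of features is selected at random. The function name `nogueira_stability` and the toy fold masks are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def nogueira_stability(selections):
    """Stability of feature selection across M runs (e.g. CV folds).

    `selections` is an M x d binary matrix: entry (i, f) is 1 if
    feature f was selected in run i, otherwise 0.
    """
    Z = np.asarray(selections, dtype=float)
    if Z.ndim != 2 or Z.shape[0] < 2:
        raise ValueError("need a 2-D matrix with at least two runs")
    m, d = Z.shape

    # Selection frequency of each feature across the M runs.
    p = Z.mean(axis=0)
    # Unbiased sample variance of each feature's selection indicator.
    var = m / (m - 1) * p * (1.0 - p)

    # Average number of selected features per run.
    k_bar = Z.sum(axis=1).mean()
    # Normaliser: variance expected under random selection of k_bar features.
    denom = (k_bar / d) * (1.0 - k_bar / d)
    if denom == 0:
        raise ValueError("undefined when no features or all features are selected")

    return 1.0 - var.mean() / denom


if __name__ == "__main__":
    # Hypothetical selection masks from five folds over six features.
    folds = [
        [1, 1, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
    ]
    print(nogueira_stability(folds))
```

In a five-fold setting like the one described, each row would be the nonzero-coefficient mask of an L1-regularized model fitted on one training fold. A value of 1 means every fold selected the same features; values near 0 indicate agreement no better than chance.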
Pages: 16