Intrinsic-dimension analysis for guiding dimensionality reduction and data fusion in multi-omics data processing

Cited by: 0
Authors
Gliozzo, Jessica [1, 2]
Soto-Gomez, Mauricio [1]
Guarino, Valentina [1]
Bonometti, Arturo [3, 4]
Cabri, Alberto [1]
Cavalleri, Emanuele [1]
Reese, Justin [5]
Robinson, Peter N. [6]
Mesiti, Marco [1, 5]
Valentini, Giorgio [1, 7]
Casiraghi, Elena [1, 5, 7, 8]
Affiliations
[1] Univ Studi Milano, Comp Sci Dept, AnacletoLab, Milan, Italy
[2] European Commiss, Joint Res Ctr JRC, Ispra, Italy
[3] Humanitas Univ, Dept Biomed Sci, Milan, Italy
[4] IRCCS Humanitas Clin & Res Hosp, Dept Pathol, Milan, Italy
[5] Lawrence Berkeley Natl Lab, Environm Genom & Syst Biol Div, Berkeley, CA USA
[6] Jackson Lab Genom Med, Farmington, CT USA
[7] Infolife Natl Lab, CINI, Rome, Italy
[8] Aalto Univ, Dept Comp Sci, Espoo, Finland
Keywords
Dimensionality reduction; Intrinsic dimensionality; Feature selection; Feature extraction; Data fusion; Multi-omics datasets; PREDICTION;
DOI
10.1016/j.artmed.2024.103049
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-omics data have revolutionized biomedical research by providing a comprehensive understanding of biological systems and the molecular mechanisms of disease development. However, analyzing multi-omics data is challenging due to high dimensionality and limited sample sizes, necessitating proper data-reduction pipelines to ensure reliable analyses. Additionally, their multimodal nature requires effective data-integration pipelines. While several dimensionality reduction and data fusion algorithms have been proposed, crucial aspects are often overlooked. Specifically, the choice of projection space dimension is typically heuristic and uniformly applied across all omics, neglecting the unique high-dimension, small-sample-size challenges faced by individual omics. This paper introduces a novel multi-modal dimensionality reduction pipeline tailored to individual views. By leveraging intrinsic dimensionality estimators, we assess the curse-of-dimensionality impact on each view and propose a two-step reduction strategy for significantly affected views, combining feature selection with feature extraction. Compared to traditional uniform reduction pipelines in a crucial and supervised multi-omics analysis setting, our approach shows significant improvement. Additionally, we explore three effective unsupervised multi-omics data fusion methods rooted in the main data fusion strategies to gain insights into their performance under crucial, yet overlooked, settings.
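To illustrate what a per-view intrinsic-dimensionality estimate might look like, the following is a minimal sketch, not the authors' implementation. The abstract does not name the estimators used, so the TwoNN-style maximum-likelihood formula here is an assumption chosen for brevity. The helper `make_toy_view` and the view names `view_a` and `view_b` are hypothetical and exist only to generate synthetic data whose latent dimension is known.

```python
import math
import random


def twonn_intrinsic_dimension(points):
    """TwoNN-style maximum-likelihood estimate of intrinsic dimension.

    For each point, mu = r2 / r1 is the ratio of the distances to its second
    and first nearest neighbours. Under locally uniform density, mu follows a
    Pareto law whose shape is the intrinsic dimension, so the estimate is
    N / sum(log(mu)). This estimator is an assumption of the sketch.
    """
    log_ratios = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0:  # skip duplicated points, whose ratio is undefined
            log_ratios.append(math.log(r2 / r1))
    return len(log_ratios) / sum(log_ratios)


def make_toy_view(n_samples, latent_dim, n_features, rng):
    """Hypothetical toy view: uniform latent factors mixed linearly into n_features."""
    weights = [[rng.gauss(0, 1) for _ in range(latent_dim)]
               for _ in range(n_features)]
    view = []
    for _ in range(n_samples):
        z = [rng.random() for _ in range(latent_dim)]
        view.append([sum(w * zk for w, zk in zip(row, z)) for row in weights])
    return view


rng = random.Random(0)
# Two synthetic "views" with the same number of features but different
# latent dimension (2 and 5), standing in for different omics layers.
views = {
    "view_a": make_toy_view(300, 2, 20, rng),
    "view_b": make_toy_view(300, 5, 20, rng),
}
for name, view in views.items():
    print(name, round(twonn_intrinsic_dimension(view), 2))
```

Both toy views have 20 features, yet their estimates should differ because their latent dimensions differ, which is the motivation for choosing a projection dimension per view rather than uniformly. The brute-force neighbour search is quadratic in the number of samples, which is acceptable only for small sample sizes.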
Pages: 13
Related papers
50 items in total
  • [31] Sliced inverse regression for integrative multi-omics data analysis
    Jain, Yashita
    Ding, Shanshan
    Qiu, Jing
    STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2019, 18 (01)
  • [32] MODMatcher: Multi-Omics Data Matcher for Integrative Genomic Analysis
    Yoo, Seungyeul
    Huang, Tao
    Campbell, Joshua D.
    Lee, Eunjee
    Tu, Zhidong
    Geraci, Mark W.
    Powell, Charles A.
    Schadt, Eric E.
    Spira, Avrum
    Zhu, Jun
    PLOS COMPUTATIONAL BIOLOGY, 2014, 10 (08)
  • [33] Editorial: Advances in methods and tools for multi-omics data analysis
    Cominetti, Ornella
    Agarwal, Sumeet
    Oller-Moreno, Sergio
    FRONTIERS IN MOLECULAR BIOSCIENCES, 2023, 10
  • [34] Network analysis with multi-omics data using graphical LASSO
    Park, Jaehyun
    Won, Sungho
    GENETIC EPIDEMIOLOGY, 2020, 44 (05) : 509 - 509
  • [35] MOMIC: A Multi-Omics Pipeline for Data Analysis, Integration and Interpretation
    Madrid-Marquez, Laura
    Rubio-Escudero, Cristina
    Pontes, Beatriz
    Gonzalez-Perez, Antonio
    Riquelme, Jose C.
    Saez, Maria E.
    APPLIED SCIENCES-BASEL, 2022, 12 (08)
  • [36] Integration strategies of multi-omics data for machine learning analysis
    Picard, Milan
    Scott-Boyer, Marie-Pier
    Bodein, Antoine
    Perin, Olivier
    Droit, Arnaud
    COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2021, 19 : 3735 - 3746
  • [37] Directional integration and pathway enrichment analysis for multi-omics data
    Slobodyanyuk, Mykhaylo
    Bahcheli, Alexander T.
    Klein, Zoe P.
    Bayati, Masroor
    Strug, Lisa J.
    Reimand, Juri
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [38] Integrating FAIR Experimental Metadata for Multi-omics Data Analysis
    Doniparthi, Gajendra
    Mühlhaus, Timo
    Deßloch, Stefan
    Datenbank-Spektrum, 2024, 24 (02) : 107 - 115
  • [39] Deep latent space fusion for adaptive representation of heterogeneous multi-omics data
    Zhang, Chengming
    Chen, Yabin
    Zeng, Tao
    Zhang, Chuanchao
    Chen, Luonan
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (02)
  • [40] The Omics Dashboard for Interactive Exploration of Metabolomics and Multi-Omics Data
    Paley, Suzanne
    Karp, Peter D.
    METABOLITES, 2024, 14 (01)