Integrating multi-omics data through deep learning for accurate cancer prognosis prediction

Cited by: 86
Authors
Chai, Hua [1]
Zhou, Xiang [1]
Zhang, Zhongyue [1]
Rao, Jiahua [1]
Zhao, Huiying [2]
Yang, Yuedong [1, 3]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510000, Peoples R China
[2] Sun Yat Sen Univ, Sun Yat Sen Mem Hosp, Guangzhou 510000, Peoples R China
[3] Sun Yat Sen Univ, Key Lab Machine Intelligence & Adv Comp MOE, Guangzhou 510000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Survival analysis; Multi-omics; Deep learning; Cancer prognosis; LYMPH-NODE METASTASIS; BREAST-CANCER; SURVIVAL; HETEROGENEITY; ASSOCIATION; ACTIVATION;
DOI
10.1016/j.compbiomed.2021.104481
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Genomic information is now widely used for precision cancer treatment. Because any single type of omics data represents only one view and suffers from noise and bias, multiple types of omics data are required for accurate cancer prognosis prediction. However, integrating multi-omics data effectively is challenging because of the large number of redundant variables and the relatively small sample sizes. With recent progress in deep learning, autoencoders have been used to integrate multi-omics data and extract representative features. Nevertheless, the resulting models are sensitive to data noise. Additionally, previous studies have usually focused on individual cancer types without comprehensive pan-cancer tests. Here, we employed a denoising autoencoder to obtain a robust representation of the multi-omics data, and then used the learned representative features to estimate patients' risks.
Results: Applied to 15 cancers from The Cancer Genome Atlas (TCGA), our method improved C-index values over previous methods by 6.5% on average. Because multi-omics data are difficult to obtain in practice, we further trained XGBoost models to fit the estimated risks using only mRNA data, and found that these models achieved an average C-index of 0.627. As a case study, the breast cancer prognosis prediction model was independently tested on three datasets from the Gene Expression Omnibus (GEO) and was shown to significantly separate high-risk patients from low-risk ones (C-index > 0.6, p-values < 0.05). Based on the risk subgroups defined by our method, we identified nine prognostic markers highly associated with breast cancer, seven of which are supported by the literature.
Conclusion: Our comprehensive tests indicate that we have constructed an accurate and robust framework for integrating multi-omics data for cancer prognosis prediction. Moreover, it is an effective way to discover cancer prognosis-related genes.
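The abstract reports its results as C-index values. As a reading aid, the following is a minimal sketch of how a concordance index is commonly computed (Harrell's definition) for right-censored survival data. It is not the authors' code: the function name `concordance_index`, its arguments, and the tie-handling convention (a tied risk counts as half) are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  -- observed follow-up time per patient
    events -- 1/True if the event (e.g. death) was observed, 0/False if censored
    risks  -- predicted risk score per patient (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # shorter time corresponds to an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # shorter survival, higher predicted risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # assumed convention: tied risks count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the thresholds quoted above (for example, C-index > 0.6 on the GEO datasets).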
Pages: 8