Integrating multi-omics data through deep learning for accurate cancer prognosis prediction

Cited by: 86
Authors
Chai, Hua [1]
Zhou, Xiang [1]
Zhang, Zhongyue [1]
Rao, Jiahua [1]
Zhao, Huiying [2]
Yang, Yuedong [1,3]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510000, Peoples R China
[2] Sun Yat Sen Univ, Sun Yat Sen Mem Hosp, Guangzhou 510000, Peoples R China
[3] Sun Yat Sen Univ, Key Lab Machine Intelligence & Adv Comp MOE, Guangzhou 510000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Survival analysis; Multi-omics; Deep learning; Cancer prognosis; LYMPH-NODE METASTASIS; BREAST-CANCER; SURVIVAL; HETEROGENEITY; ASSOCIATION; ACTIVATION;
DOI
10.1016/j.compbiomed.2021.104481
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Background: Genomic information is now widely used for precision cancer treatment. Because any single type of omics data represents only one view and suffers from noise and bias, multiple types of omics data are required for accurate cancer prognosis prediction. However, integrating multi-omics data effectively is challenging because of the large number of redundant variables and the relatively small sample size. With recent progress in deep learning, autoencoders have been used to integrate multi-omics data and extract representative features. Nevertheless, the resulting models are sensitive to data noise. Additionally, previous studies usually focused on individual cancer types without comprehensive pan-cancer tests. Here, we employed a denoising autoencoder to obtain a robust representation of the multi-omics data, and then used the learned representative features to estimate patients' risks. Results: Applied to 15 cancers from The Cancer Genome Atlas (TCGA), our method improved C-index values over previous methods by 6.5% on average. Considering the difficulty of obtaining multi-omics data in practice, we further trained XGBoost models to fit the estimated risks using only mRNA data, and found that these models achieved an average C-index of 0.627. As a case study, the breast cancer prognosis prediction model was independently tested on three datasets from the Gene Expression Omnibus (GEO) and was shown to significantly separate high-risk patients from low-risk ones (C-index > 0.6, p-values < 0.05). Based on the risk subgroups defined by our method, we identified nine prognostic markers highly associated with breast cancer, seven of which are supported by the literature. Conclusion: Our comprehensive tests indicate that we have constructed an accurate and robust framework for integrating multi-omics data for cancer prognosis prediction. Moreover, it is an effective way to discover cancer prognosis-related genes.
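The abstract reports its results as C-index values. As a rough illustration of what that metric measures, the following is a minimal sketch of a concordance index in plain Python. It assumes Harrell's formulation and skips pairs with tied survival times; the record does not state which variant the authors used, and the function name and toy data are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Sketch of Harrell's C-index (assumed variant, not taken from the paper).

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk failed earlier: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort, for illustration only
times = [5, 8, 12, 20, 25]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.7, 0.3, 0.1]
c = concordance_index(times, events, risks)
print(round(c, 3))  # 0.875 (7 of 8 comparable pairs are concordant)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for the reported average of 0.627 for the mRNA-only models.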
Pages: 8
Related papers
50 in total
  • [41] Multimodal deep learning approaches for single-cell multi-omics data integration
    Athaya, Tasbiraha
    Ripan, Rony Chowdhury
    Li, Xiaoman
    Hu, Haiyan
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (05)
  • [42] Performance Comparison of Deep Learning Autoencoders for Cancer Subtype Detection Using Multi-Omics Data
    Franco, Edian F.
    Rana, Pratip
    Cruz, Aline
    Calderon, Victor V.
    Azevedo, Vasco
    Ramos, Rommel T. J.
    Ghosh, Preetam
    CANCERS, 2021, 13 (09)
  • [43] Diagnostic Classification of Lung Cancer Using Deep Transfer Learning Technology and Multi-Omics Data
    Zhu, Rong
    Dai, Lingyun
    Liu, Jinxing
    Guo, Ying
    CHINESE JOURNAL OF ELECTRONICS, 2021, 30 (05) : 843 - 852
  • [44] Robust Prognostic Subtyping of Muscle-Invasive Bladder Cancer Revealed by Deep Learning-Based Multi-Omics Data Integration
    Zhang, Xiaolong
    Wang, Jiayin
    Lu, Jiabin
    Su, Lili
    Wang, Changxi
    Huang, Yuhua
    Zhang, Xuanping
    Zhu, Xiaoyan
    FRONTIERS IN ONCOLOGY, 2021, 11
  • [45] Incorporating deep learning and multi-omics autoencoding for analysis of lung adenocarcinoma prognostication
    Lee, Tzong-Yi
    Huang, Kai-Yao
    Chuang, Cheng-Hsiang
    Lee, Cheng-Yang
    Chang, Tzu-Hao
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2020, 87 (87)
  • [46] Approaches to Integrating Metabolomics and Multi-Omics Data: A Primer
    Jendoubi, Takoua
    METABOLITES, 2021, 11 (03)
  • [47] MCNF: A Novel Method for Cancer Subtyping by Integrating Multi-Omics and Clinical Data
    Zhao, Lan
    Yan, Hong
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2020, 17 (05) : 1682 - 1690
  • [48] Advanced segmentation method for integrating multi-omics data for early cancer detection
    Sangeetha, S. K. B.
    Mathivanan, Sandeep Kumar
    Azath, M.
    Beniwal, Ravinder
    Ahmad, Naim
    Ghribi, Wade
    Mallik, Saurav
    EGYPTIAN INFORMATICS JOURNAL, 2025, 29
  • [49] DeePROG: Deep Attention-Based Model for Diseased Gene Prognosis by Fusing Multi-Omics Data
    Dutta, Pratik
    Patra, Aditya Prakash
    Saha, Sriparna
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (05) : 2770 - 2781
  • [50] Deep learning assisted multi-omics integration for survival and drug-response prediction in breast cancer
    Malik, Vidhi
    Kalakoti, Yogesh
    Sundar, Durai
    BMC Genomics, 22