Integrating multi-omics data through deep learning for accurate cancer prognosis prediction

Cited by: 86
Authors
Chai, Hua [1]
Zhou, Xiang [1]
Zhang, Zhongyue [1]
Rao, Jiahua [1]
Zhao, Huiying [2]
Yang, Yuedong [1, 3]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510000, Peoples R China
[2] Sun Yat Sen Univ, Sun Yat Sen Mem Hosp, Guangzhou 510000, Peoples R China
[3] Sun Yat Sen Univ, Key Lab Machine Intelligence & Adv Comp MOE, Guangzhou 510000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Survival analysis; Multi-omics; Deep learning; Cancer prognosis; LYMPH-NODE METASTASIS; BREAST-CANCER; SURVIVAL; HETEROGENEITY; ASSOCIATION; ACTIVATION;
DOI
10.1016/j.compbiomed.2021.104481
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09;
Abstract
Background: Genomic information is now widely used to guide precision cancer treatment. Because any single type of omics data offers only one view and is subject to noise and bias, accurate cancer prognosis prediction requires multiple types of omics data. However, effectively integrating multi-omics data is challenging because the number of redundant variables is large while the sample size is relatively small. With recent progress in deep learning, autoencoders have been used to integrate multi-omics data and extract representative features. Nevertheless, the resulting models are fragile to data noise. Additionally, previous studies usually focused on individual cancer types without comprehensive pan-cancer testing. Here, we employed a denoising autoencoder to obtain a robust representation of the multi-omics data, and then used the learned features to estimate patients' risks.
Results: Applied to 15 cancers from The Cancer Genome Atlas (TCGA), our method improved C-index values over previous methods by 6.5% on average. Considering the difficulty of obtaining multi-omics data in practice, we further trained XGBoost models to fit the estimated risks using only mRNA data, and found that these models achieved an average C-index of 0.627. As a case study, the breast cancer prognosis prediction model was independently tested on three datasets from the Gene Expression Omnibus (GEO) and was shown to significantly separate high-risk patients from low-risk ones (C-index > 0.6, p-values < 0.05). Based on the risk subgroups defined by our method, we identified nine prognostic markers highly associated with breast cancer, seven of which are supported by the literature.
Conclusion: Our comprehensive tests indicate that we have constructed an accurate and robust framework for integrating multi-omics data for cancer prognosis prediction. Moreover, it is an effective way to discover cancer prognosis-related genes.
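The abstract reports its results as C-index values. As a rough illustration of what that metric measures, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is not the authors' evaluation code: the function name, the toy data, and the choice to skip pairs with tied follow-up times are assumptions made here for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: the fraction of comparable patient pairs in which
    the patient with the shorter survival time has the higher predicted risk.

    times  -- follow-up times
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    risks  -- predicted risk scores (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): skip pairs with tied times. A pair is
        # also not comparable when the shorter time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # ranking agrees with observed outcome
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, the second one censored.
times = [5, 8, 12, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.7]
c_index = concordance_index(times, events, risks)
print(c_index)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values (for example, the average of 0.627 for the mRNA-only models) should be read.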
Pages: 8
Related papers
50 in total
  • [1] Enhancing Lung Cancer Classification and Prediction With Deep Learning and Multi-Omics Data
    Mohamed, Tehnan I. A.
    Ezugwu, Absalom El-Shamir
    IEEE ACCESS, 2024, 12 : 59880 - 59892
  • [2] MOFNet: A Deep Learning Framework of Integrating Multi-omics Data for Breast Cancer Diagnosis
    Zhang, Chunxiao
    Li, Pengpai
    Sun, Duanchen
    Liu, Zhi-Ping
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT III, 2023, 14088 : 727 - 738
  • [3] TLSurv: Integrating Multi-Omics Data by Multi-Stage Transfer Learning for Cancer Survival Prediction
    Jiang, Yixing
    Alford, Kristen
    Ketchum, Frank
    Tong, Li
    Wang, May D.
    ACM-BCB 2020 - 11TH ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS, 2020,
  • [4] A review of multi-omics data integration through deep learning approaches for disease diagnosis, prognosis, and treatment
    Wekesa, Jael Sanyanda
    Kimwele, Michael
    FRONTIERS IN GENETICS, 2023, 14
  • [5] Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data
    Bin Baek
    Hyunju Lee
    Scientific Reports, 10
  • [6] A Contrastive-Learning-Based Deep Neural Network for Cancer Subtyping by Integrating Multi-Omics Data
    Chai, Hua
    Deng, Weizhen
    Wei, Junyu
    Guan, Ting
    He, Minfan
    Liang, Yong
    Li, Le
    INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES, 2024, 16 (04) : 966 - 975
  • [7] Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data
    Baek, Bin
    Lee, Hyunju
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [8] Integrating multi-omics data by learning modality invariant representations for improved prediction of overall survival of cancer
    Tong, Li
    Wu, Hang
    Wang, May D.
    METHODS, 2021, 189 : 74 - 85
  • [9] A Deep Learning Fusion Clustering framework for breast cancer subtypes identification by integrating multi-omics data
    Liu Shuangshuang
    Qi Lin
    Tie Yun
    Liu Fenghui
    2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020), 2020, : 1710 - 1714
  • [10] DeepProg: an ensemble of deep-learning and machine-learning models for prognosis prediction using multi-omics data
    Poirion, Olivier B.
    Jing, Zheng
    Chaudhary, Kumardeep
    Huang, Sijia
    Garmire, Lana X.
    GENOME MEDICINE, 2021, 13 (01)