Integrating multi-omics data through deep learning for accurate cancer prognosis prediction

Times cited: 86
Authors
Chai, Hua [1]
Zhou, Xiang [1]
Zhang, Zhongyue [1]
Rao, Jiahua [1]
Zhao, Huiying [2]
Yang, Yuedong [1, 3]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510000, Peoples R China
[2] Sun Yat Sen Univ, Sun Yat Sen Mem Hosp, Guangzhou 510000, Peoples R China
[3] Sun Yat Sen Univ, Key Lab Machine Intelligence & Adv Comp MOE, Guangzhou 510000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Survival analysis; Multi-omics; Deep learning; Cancer prognosis; LYMPH-NODE METASTASIS; BREAST-CANCER; SURVIVAL; HETEROGENEITY; ASSOCIATION; ACTIVATION;
DOI
10.1016/j.compbiomed.2021.104481
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Genomic information is now widely used to guide precision cancer treatment. Because any single type of omics data offers only one view and suffers from noise and bias, multiple types of omics data are needed for accurate cancer prognosis prediction. However, integrating multi-omics data effectively is challenging because of the large number of redundant variables and the relatively small sample sizes. With recent progress in deep learning, autoencoders have been used to integrate multi-omics data and extract representative features. Nevertheless, the resulting models are sensitive to noise in the data. Additionally, previous studies usually focused on individual cancer types without comprehensive pan-cancer tests. Here, we employed a denoising autoencoder to obtain a robust representation of the multi-omics data, and then used the learned representative features to estimate patients' risks.
Results: Applied to 15 cancers from The Cancer Genome Atlas (TCGA), our method improved C-index values over previous methods by 6.5% on average. Considering the difficulty of obtaining multi-omics data in practice, we further trained XGBoost models to fit the estimated risks using only mRNA data, and found that these models achieved an average C-index of 0.627. As a case study, the breast cancer prognosis prediction model was independently tested on three datasets from the Gene Expression Omnibus (GEO) and was shown to significantly separate high-risk patients from low-risk ones (C-index > 0.6, p-values < 0.05). Based on the risk subgroups defined by our method, we identified nine prognostic markers highly associated with breast cancer, seven of which are supported by the literature.
Conclusion: Our comprehensive tests indicate that we have constructed an accurate and robust framework for integrating multi-omics data for cancer prognosis prediction. Moreover, it is an effective way to discover cancer prognosis-related genes.
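The abstract reports its results as C-index values. As a minimal sketch of how that metric is commonly defined (Harrell's concordance index for right-censored data), and not the authors' evaluation code, the following computes it from scratch. The function name `concordance_index`, the handling of tied survival times (skipped here for simplicity), and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch, not the paper's code).

    times  : observed follow-up times
    events : 1 if the event (e.g. death) was observed, 0 if censored
    risks  : predicted risk scores; higher means worse expected prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with tied times are skipped. A pair is also
        # not comparable when the shorter time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event has the higher predicted risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one is censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(times, events, risks))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of patients by risk, which is the scale on which the reported average of 0.627 and the threshold of 0.6 should be read.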
Number of pages: 8
Related papers
  • [1] DeepProg: an ensemble of deep-learning and machine-learning models for prognosis prediction using multi-omics data
    Olivier B. Poirion
    Zheng Jing
    Kumardeep Chaudhary
    Sijia Huang
    Lana X. Garmire
    Genome Medicine, 13
  • [2] DeepProg: an ensemble of deep-learning and machine-learning models for prognosis prediction using multi-omics data
    Poirion, Olivier B.
    Jing, Zheng
    Chaudhary, Kumardeep
    Huang, Sijia
    Garmire, Lana X.
    GENOME MEDICINE, 2021, 13 (01)
  • [3] Enhancing Lung Cancer Classification and Prediction With Deep Learning and Multi-Omics Data
    Mohamed, Tehnan I. A.
    Ezugwu, Absalom El-Shamir
    IEEE ACCESS, 2024, 12 : 59880 - 59892
  • [4] MOFNet: A Deep Learning Framework of Integrating Multi-omics Data for Breast Cancer Diagnosis
    Zhang, Chunxiao
    Li, Pengpai
    Sun, Duanchen
    Liu, Zhi-Ping
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT III, 2023, 14088 : 727 - 738
  • [5] A review of multi-omics data integration through deep learning approaches for disease diagnosis, prognosis, and treatment
    Wekesa, Jael Sanyanda
    Kimwele, Michael
    FRONTIERS IN GENETICS, 2023, 14
  • [6] Group Lasso Regularized Deep Learning for Cancer Prognosis from Multi-Omics and Clinical Features
    Xie, Gangcai
    Dong, Chengliang
    Kong, Yinfei
    Zhong, Jiang E.
    Li, Mingyao
    Wang, Kai
    GENES, 2019, 10 (03)
  • [7] Predicting bladder cancer prognosis by integrating multi-omics data through a transfer learning-based Cox proportional hazards network
    Chai, Hua
    Zhang, Zhongyue
    Wang, Yi
    Yang, Yuedong
    CCF TRANSACTIONS ON HIGH PERFORMANCE COMPUTING, 2021, 3 (03) : 311 - 319
  • [8] Predicting bladder cancer prognosis by integrating multi-omics data through a transfer learning-based Cox proportional hazards network
    Hua Chai
    Zhongyue Zhang
    Yi Wang
    Yuedong Yang
    CCF Transactions on High Performance Computing, 2021, 3 : 311 - 319
  • [9] Deep learning-based ovarian cancer subtypes identification using multi-omics data
    Guo, Long-Yi
    Wu, Ai-Hua
    Wang, Yong-xia
    Zhang, Li-ping
    Chai, Hua
    Liang, Xue-Fang
    BIODATA MINING, 2020, 13 (01)
  • [10] A Deep Learning Fusion Clustering framework for breast cancer subtypes identification by integrating multi-omics data
    Liu Shuangshuang
    Qi Lin
    Tie Yun
    Liu Fenghui
    2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020), 2020, : 1710 - 1714