Integrating multi-omics data through deep learning for accurate cancer prognosis prediction

Cited by: 102
Authors
Chai, Hua [1]
Zhou, Xiang [1]
Zhang, Zhongyue [1]
Rao, Jiahua [1]
Zhao, Huiying [2]
Yang, Yuedong [1,3]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510000, Peoples R China
[2] Sun Yat Sen Univ, Sun Yat Sen Mem Hosp, Guangzhou 510000, Peoples R China
[3] Sun Yat Sen Univ, Key Lab Machine Intelligence & Adv Comp MOE, Guangzhou 510000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Survival analysis; Multi-omics; Deep learning; Cancer prognosis; LYMPH-NODE METASTASIS; BREAST-CANCER; SURVIVAL; HETEROGENEITY; ASSOCIATION; ACTIVATION;
DOI
10.1016/j.compbiomed.2021.104481
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: Genomic information is now widely used to guide precise cancer treatment. Because any single type of omics data represents only one view and suffers from noise and bias, multiple types of omics data are required for accurate cancer prognosis prediction. However, integrating multi-omics data effectively is challenging because of the large number of redundant variables and the relatively small sample size. With recent progress in deep learning, autoencoders have been used to integrate multi-omics data and extract representative features. Nevertheless, the resulting models are sensitive to data noise. In addition, previous studies usually focused on individual cancer types without comprehensive pan-cancer tests. Here, we employed a denoising autoencoder to obtain a robust representation of the multi-omics data, and then used the learned features to estimate patients' risks.
Results: Applied to 15 cancers from The Cancer Genome Atlas (TCGA), our method improved C-index values over previous methods by 6.5% on average. Given the difficulty of obtaining multi-omics data in practice, we further trained XGBoost models to fit the estimated risks using only mRNA data, and found that these models achieved an average C-index of 0.627. As a case study, the breast cancer prognosis prediction model was independently tested on three datasets from the Gene Expression Omnibus (GEO) and was shown to significantly separate high-risk patients from low-risk ones (C-index > 0.6, p-values < 0.05). Based on the risk subgroups defined by our method, we identified nine prognostic markers highly associated with breast cancer, seven of which are supported by the literature.
Conclusion: Our comprehensive tests indicate that we have constructed an accurate and robust framework for integrating multi-omics data for cancer prognosis prediction. Moreover, it is an effective way to discover cancer prognosis-related genes.
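The C-index reported in the abstract measures how well predicted risks rank patients by survival time. The following is a minimal sketch of that metric, assuming Harrell's definition with censoring (the abstract does not state which variant the authors used). The function name `concordance_index` and the toy data are hypothetical and serve only as illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the patient with the shorter
        # time actually experienced the event (was not censored).
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk failed earlier
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risks count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.6, 0.7, 0.1]
c_index = concordance_index(times, events, risks)
```

On this toy data, four of the five comparable pairs are ranked correctly, so `c_index` is 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported average of 0.627.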
Pages: 8
Related papers
36 in total
[1]   Towards clinically more relevant dissection of patient heterogeneity via survival-based Bayesian clustering [J].
Ahmad, Ashar ;
Froehlich, Holger .
BIOINFORMATICS, 2017, 33 (22) :3558-3566
[2]  
[Anonymous], ery and Data Mining, DOI 10.1145/2939672.2939785
[3]   Systematic pan-cancer analysis of tumour purity [J].
Aran, Dvir ;
Sirota, Marina ;
Butte, Atul J. .
NATURE COMMUNICATIONS, 2015, 6
[4]   A novel imputation methodology for time series based on pattern sequence forecasting [J].
Bokde, Neeraj ;
Beck, Marcus W. ;
Martinez Alvarez, Francisco ;
Kulat, Kishore .
PATTERN RECOGNITION LETTERS, 2018, 116 :88-96
[5]   CCR7 and CXCR4 as novel biomarkers predicting axillary lymph node metastasis in T1 breast cancer [J].
Cabioglu, N ;
Yazici, MS ;
Arun, B ;
Broglio, KR ;
Hortobagyi, GN ;
Price, JE ;
Sahin, A .
CLINICAL CANCER RESEARCH, 2005, 11 (16) :5686-5693
[6]   Effects of infiltrating lymphocytes and estrogen receptor on gene expression and prognosis in breast cancer [J].
Calabro, Alberto ;
Beissbarth, Tim ;
Kuner, Ruprecht ;
Stojanov, Michael ;
Benner, Axel ;
Asslaber, Martin ;
Ploner, Ferdinand ;
Zatloukal, Kurt ;
Samonigg, Hellmut ;
Poustka, Annemarie ;
Sueltmann, Holger .
BREAST CANCER RESEARCH AND TREATMENT, 2009, 116 (01) :69-77
[7]   Deep Learning-Based Multi-Omics Integration Robustly Predicts Survival in Liver Cancer [J].
Chaudhary, Kumardeep ;
Poirion, Olivier B. ;
Lu, Liangqun ;
Garmire, Lana X. .
CLINICAL CANCER RESEARCH, 2018, 24 (06) :1248-1259
[8]   Deep learning with multimodal representation for pancancer prognosis prediction [J].
Cheerla, Anika ;
Gevaert, Olivier .
BIOINFORMATICS, 2019, 35 (14) :I446-I454
[9]   ADIPOQ/adiponectin induces cytotoxic autophagy in breast cancer cells through STK11/LKB1-mediated activation of the AMPK-ULK1 axis [J].
Chung, Seung J. ;
Nagaraju, Ganji Purnachandra ;
Nagalingam, Arumugam ;
Muniraj, Nethaji ;
Kuppusamy, Panjamurthy ;
Walker, Alyssa ;
Woo, Juhyung ;
Gyorffy, Balazs ;
Gabrielson, Ed ;
Saxena, Neeraj K. ;
Sharma, Dipali .
AUTOPHAGY, 2017, 13 (08) :1386-1403
[10]   Robust clustering of noisy high-dimensional gene expression data for patients subtyping [J].
Coretto, Pietro ;
Serra, Angela ;
Tagliaferri, Roberto .
BIOINFORMATICS, 2018, 34 (23) :4064-4072