Integrating multi-omics data through deep learning for accurate cancer prognosis prediction

Cited by: 102

Authors:

Chai, Hua [1]

Zhou, Xiang [1]

Zhang, Zhongyue [1]

Rao, Jiahua [1]

Zhao, Huiying [2]

Yang, Yuedong [1,3]

Affiliations:

[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510000, Peoples R China

[2] Sun Yat Sen Univ, Sun Yat Sen Mem Hosp, Guangzhou 510000, Peoples R China

[3] Sun Yat Sen Univ, Key Lab Machine Intelligence & Adv Comp MOE, Guangzhou 510000, Peoples R China

Source:

COMPUTERS IN BIOLOGY AND MEDICINE | 2021 / Volume 134

Funding:

National Natural Science Foundation of China;

Keywords:

Survival analysis; Multi-omics; Deep learning; Cancer prognosis; LYMPH-NODE METASTASIS; BREAST-CANCER; SURVIVAL; HETEROGENEITY; ASSOCIATION; ACTIVATION;

DOI:

10.1016/j.compbiomed.2021.104481

Chinese Library Classification number:

Q [Biological Sciences];

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

Background: Genomic information is now widely used to guide precise cancer treatment. Because any single type of omics data represents only one view and suffers from noise and bias, multiple types of omics data are required for accurate cancer prognosis prediction. However, integrating multi-omics data effectively is challenging because of the large number of redundant variables and the relatively small sample sizes. With recent progress in deep learning, autoencoders have been used to integrate multi-omics data and extract representative features. Nevertheless, the resulting models are sensitive to data noise. Additionally, previous studies usually focused on individual cancer types without comprehensive pan-cancer testing. Here, we employed a denoising autoencoder to obtain a robust representation of the multi-omics data and then used the learned features to estimate patients' risks.

Results: Applied to 15 cancers from The Cancer Genome Atlas (TCGA), our method improved C-index values over previous methods by 6.5% on average. Considering the difficulty of obtaining multi-omics data in practice, we further trained XGBoost models to fit the estimated risks using only mRNA data, and found that these models achieved an average C-index of 0.627. As a case study, the breast cancer prognosis prediction model was independently tested on three datasets from the Gene Expression Omnibus (GEO) and was shown to separate high-risk patients from low-risk ones significantly (C-index > 0.6, p-values < 0.05). Based on the risk subgroups defined by our method, we identified nine prognostic markers highly associated with breast cancer, seven of which are supported by the literature.

Conclusion: Our comprehensive tests indicate that we have constructed an accurate and robust framework for integrating multi-omics data for cancer prognosis prediction. Moreover, it is an effective way to discover cancer prognosis-related genes.
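
The abstract reports its results as C-index values. As a reading aid, the following is a minimal sketch of how Harrell's concordance index is commonly computed for right-censored survival data. It is not taken from the paper: the function name `concordance_index`, its argument names, and the toy data are illustrative assumptions, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (simplified sketch).

    times  -- observed follow-up times
    events -- 1/True if death was observed, 0/False if censored
    risks  -- predicted risk scores (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Skip tied times (simplifying assumption of this sketch).
        if times[i] == times[j]:
            continue
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # The pair is comparable only if the earlier time is an observed event.
        if not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk for the earlier event: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one is censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.6, 0.7, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of patients by risk, which gives context for the reported average of 0.627 and the C-index > 0.6 threshold on the GEO datasets.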

Pages: 8

