Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis

Times cited: 98
Authors
Ding, Wubin [1 ,2 ]
Chen, Geng [1 ,2 ]
Shi, Tieliu [1 ,2 ,3 ]
Affiliations
[1] East China Normal Univ, Ctr Bioinformat & Computat Biol, Shanghai 200241, Peoples R China
[2] East China Normal Univ, Inst Biomed Sci, Sch Life Sci, Shanghai 200241, Peoples R China
[3] Guangxi Med Univ, Guangxi Key Lab Biol Targeting Diag & Therapy Res, Natl Ctr Int Res Biol Targeting Diag & Therapy, Collaborat Innovat Ctr Targeting Tumor Diag & The, Nanning, Peoples R China
Funding
National High Technology Research and Development Program of China (863 Program); National Science Foundation (USA);
Keywords
DNA methylation; cancer diagnosis; prognosis; survival analysis; machine learning; pan-cancer; GENOME-WIDE ANALYSIS; REVEALS MOLECULAR CLASSIFICATION; GENE-EXPRESSION; TBX2; PROGRESSION; SIGNATURES; TISSUES;
DOI
10.1080/15592294.2019.1568178
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
DNA methylation status is closely associated with diverse diseases and is generally more stable than gene expression, so abnormal DNA methylation could serve as an important biomarker for tumor diagnosis, treatment and prognosis. However, DNA methylation signatures for pan-cancer diagnosis and prognosis remain less explored. Here we systematically analyzed genome-wide DNA methylation patterns in diverse TCGA cancers with machine learning. We identified seven CpG sites that could effectively discriminate tumor samples from adjacent normal tissue samples for 12 main TCGA cancers (1216 samples, AUC > 0.99). These seven potential diagnostic biomarkers were further validated in 9 other TCGA cancers and 4 independent datasets (AUC > 0.92). Three of the seven CpG sites were correlated with cell division, DNA replication and cell cycle. We also identified 12 CpG sites that can effectively distinguish 26 different cancers (7605 samples), and the result was repeatable in independent datasets as well as in two disparate tumors with metastases (micro-average AUC > 0.89). Furthermore, a series of potential signatures that could significantly predict the prognosis of tumor patients for 7 different cancers were identified via survival analysis (p-value < 1e-4). Collectively, DNA methylation patterns vary greatly between tumor and adjacent normal tissues, as well as among different types of cancers. Our identified signatures may aid clinical decisions on diagnosis and prognosis for pan-cancer, and the potential cancer-specific biomarkers could be used to predict the primary site of metastatic breast and prostate cancers.
Pages: 67-80
Number of pages: 14