Predicting cancer origins with a DNA methylation-based deep neural network model

Cited by: 36
Authors
Zheng, Chunlei [1]
Xu, Rong [1]
Affiliations
[1] Case Western Reserve Univ, Sch Med, Ctr Artificial Intelligence Drug Discovery, Cleveland, OH USA
Source
PLOS ONE | 2020 / Vol. 15 / Issue 05
Funding
National Institutes of Health (USA);
Keywords
UNKNOWN PRIMARY; TUMOR-TISSUE; CARCINOMA; IDENTIFICATION; VALIDATION; PROFILE; MICRORNA; MARKERS; ADENOCARCINOMA; MULTICENTER;
DOI
10.1371/journal.pone.0226461
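Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract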
Determining the origin of a cancer, combined with site-specific treatment of patients with metastatic cancer, is critical to improving patient outcomes. Existing pathology and gene expression-based techniques often have limited performance. In this study, we developed a deep neural network (DNN)-based classifier for cancer origin prediction using DNA methylation data of 7,339 patients of 18 different cancer origins from The Cancer Genome Atlas (TCGA). This DNN model was evaluated using four strategies: (1) when evaluated by 10-fold cross-validation, it achieved an overall specificity of 99.72% (95% CI 99.69%-99.75%) and sensitivity of 92.59% (95% CI 91.87%-93.30%); (2) when tested on hold-out testing data of 1,468 patients, the model had an overall specificity of 99.83% and sensitivity of 95.95%; (3) when tested on 143 metastasized cancer patients (12 cancer origins), the model achieved an overall specificity of 99.47% and sensitivity of 95.95%; and (4) when tested on an independent dataset of 581 samples (10 cancer origins), the model achieved an overall specificity of 99.91% and sensitivity of 93.43%. Compared to existing pathology and gene expression-based techniques, the DNA methylation-based DNN classifier showed higher performance and had the unique advantage of easy implementation in clinical settings. In summary, our study shows that DNA methylation-based DNN models have potential both in the diagnosis of cancer of unknown primary and in the identification of cancer cell types of circulating tumor cells.
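
The abstract reports an "overall" sensitivity and specificity for a multi-class classifier. As a minimal sketch of how such figures can be derived, the Python snippet below computes one-vs-rest sensitivity and specificity for each class from a confusion matrix and then takes their unweighted mean. The macro-averaging step, the function name `one_vs_rest_metrics`, and the toy labels are assumptions made for illustration; this record does not state how the authors aggregated per-class values.

```python
import numpy as np

def one_vs_rest_metrics(y_true, y_pred, n_classes):
    """Return (mean sensitivity, mean specificity) across classes.

    Assumes integer class labels in [0, n_classes) and that every
    class appears at least once in y_true. Macro-averaging is an
    illustrative choice, not necessarily the paper's method.
    """
    # Confusion matrix: rows are true classes, columns are predictions.
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1

    tp = np.diag(cm)                 # correctly assigned to each class
    fn = cm.sum(axis=1) - tp         # class members assigned elsewhere
    fp = cm.sum(axis=0) - tp         # non-members assigned to the class
    tn = cm.sum() - tp - fn - fp     # everything else

    sensitivity = tp / (tp + fn)     # per-class true positive rate
    specificity = tn / (tn + fp)     # per-class true negative rate
    return sensitivity.mean(), specificity.mean()

# Toy example with three hypothetical cancer origins (labels 0, 1, 2).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
sens, spec = one_vs_rest_metrics(y_true, y_pred, n_classes=3)
print(f"sensitivity={sens:.4f}, specificity={spec:.4f}")
```

With many classes, each class's true negatives dominate its one-vs-rest table, which is one reason specificity in a setting like this tends to sit much closer to 100% than sensitivity.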
Pages: 17