Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data

Cited by: 78
Authors
Chen, Runpu [1]
Yang, Le [1]
Goodison, Steve [2]
Sun, Yijun [1, 3, 4]
Affiliations
[1] SUNY Buffalo, Dept Comp Sci & Engn, Buffalo, NY 14214 USA
[2] Mayo Clin, Dept Hlth Sci Res, Jacksonville, FL 32224 USA
[3] SUNY Buffalo, Dept Microbiol & Immunol, Buffalo, NY 14214 USA
[4] SUNY Buffalo, Dept Biostat, Buffalo, NY 14214 USA
Keywords
BREAST-CANCER; MODEL; DISCOVERY; CLUSTERS
DOI
10.1093/bioinformatics/btz769
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and to support individualized patient management. Existing methods are limited in their ability to handle extremely high-dimensional data and are influenced by misleading, irrelevant factors, which results in ambiguous and overlapping subtypes.
Results: To address these issues, we propose a novel approach that disentangles and eliminates irrelevant factors by leveraging deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that jointly performs supervised classification, unsupervised clustering and dimensionality reduction to learn a cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for deriving more accurate and robust molecular cancer subtypes from increasingly complex, multi-source data.
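The abstract describes a joint objective combining supervised classification, unsupervised clustering and dimensionality reduction. The following is a minimal sketch of what such a combined loss could look like, not the authors' actual formulation. The specific terms (cross-entropy, a k-means-style distance penalty, a row-wise group-sparsity penalty on first-layer weights), the function names, and the weights `alpha` and `beta` are all illustrative assumptions.

```python
import math

def cross_entropy(probs, labels):
    """Supervised term: mean negative log-probability of the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def kmeans_penalty(embeddings, centers, assignments):
    """Clustering term: mean squared distance of each embedding to its assigned center."""
    total = 0.0
    for z, k in zip(embeddings, assignments):
        total += sum((zi - ci) ** 2 for zi, ci in zip(z, centers[k]))
    return total / len(embeddings)

def group_sparsity(first_layer_weights):
    """Sparsity term: sum of row-wise L2 norms.

    Assumes one row per input gene, so a row driven to zero drops that gene.
    """
    return sum(math.sqrt(sum(w * w for w in row)) for row in first_layer_weights)

def joint_loss(probs, labels, embeddings, centers, assignments,
               first_layer_weights, alpha=1.0, beta=0.1):
    """Weighted sum of the three terms; alpha and beta are hypothetical trade-off weights."""
    return (cross_entropy(probs, labels)
            + alpha * kmeans_penalty(embeddings, centers, assignments)
            + beta * group_sparsity(first_layer_weights))
```

In a sketch like this, raising `beta` would push more gene rows toward zero, which is one plausible way to obtain subtypes "using fewer genes", while `alpha` controls how strongly the learned representation is pulled toward compact clusters.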
Pages: 1476-1483
Page count: 8