Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data

Cited by: 88
Authors
Chen, Runpu [1]
Yang, Le [1]
Goodison, Steve [2]
Sun, Yijun [1, 3, 4]
Affiliations
[1] SUNY Buffalo, Dept Comp Sci & Engn, Buffalo, NY 14214 USA
[2] Mayo Clin, Dept Hlth Sci Res, Jacksonville, FL 32224 USA
[3] SUNY Buffalo, Dept Microbiol & Immunol, Buffalo, NY 14214 USA
[4] SUNY Buffalo, Dept Biostat, Buffalo, NY 14214 USA
Keywords
BREAST-CANCER; MODEL; DISCOVERY; CLUSTERS
DOI
10.1093/bioinformatics/btz769
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and develop individualized patient management. Existing methods are limited by their ability to handle extremely high-dimensional data and by the influence of misleading, irrelevant factors, resulting in ambiguous and overlapping subtypes.
Results: To address the above issues, we proposed a novel approach to disentangling and eliminating irrelevant factors by leveraging the power of deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that performs joint supervised classification, unsupervised clustering and dimensionality reduction to learn cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for the derivation of more accurate and robust molecular cancer subtypes by using increasingly complex, multi-source data.
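The abstract describes joint supervised classification, unsupervised clustering and dimensionality reduction but does not give the objective function. The following is only a minimal sketch of how three such terms could be combined into one loss. Everything in it is an assumption made for illustration and is not taken from the abstract: the one-hidden-layer ReLU network, the cross-entropy term, the row-wise norm penalty on the input weights (one possible way to favor "fewer genes"), the k-means-style distance term, and the names `joint_loss`, `alpha` and `beta`.

```python
import math

def relu_layer(x, W):
    """Hidden representation ReLU(x @ W), with W indexed as W[gene][unit]."""
    units = len(W[0])
    return [max(0.0, sum(x[g] * W[g][u] for g in range(len(x))))
            for u in range(units)]

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def joint_loss(X, y, W1, W2, centers, assign, alpha=0.01, beta=0.1):
    """Hypothetical joint objective (an assumption, not the published one):
    cross-entropy + alpha * row-wise norm of W1 + beta * clustering term."""
    n = len(X)
    ce = 0.0
    cluster = 0.0
    for x, label, k in zip(X, y, assign):
        h = relu_layer(x, W1)
        logits = [sum(h[u] * W2[u][c] for u in range(len(h)))
                  for c in range(len(W2[0]))]
        p = softmax(logits)
        # Supervised term: negative log-likelihood of the known class label.
        ce += -math.log(p[label] + 1e-12)
        # Clustering term: squared distance to the assigned cluster center.
        cluster += sum((hv - cv) ** 2 for hv, cv in zip(h, centers[k]))
    ce /= n
    cluster /= n
    # Sparsity term: sum of per-gene (row) L2 norms of the input weights.
    # A gene whose whole row is zero contributes nothing and is effectively dropped.
    sparsity = sum(math.sqrt(sum(w * w for w in row)) for row in W1)
    total = ce + alpha * sparsity + beta * cluster
    return total, {"classification": ce, "sparsity": sparsity, "clustering": cluster}

# Toy data: 2 samples, 3 genes, 2 hidden units, 2 classes, 2 clusters.
X = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
y = [0, 1]
W1 = [[0.5, -0.5], [0.0, 0.0], [0.25, 0.5]]  # second gene has an all-zero row
W2 = [[1.0, -1.0], [-1.0, 1.0]]
centers = [[1.0, 0.5], [0.0, 0.5]]
assign = [0, 1]
total, parts = joint_loss(X, y, W1, W2, centers, assign)
```

In this sketch `alpha` and `beta` trade off gene sparsity and cluster compactness against classification accuracy; in practice the weights, centers and assignments would be learned rather than fixed by hand as they are here.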
Pages: 1476-1483
Page count: 8