Towards clinically more relevant dissection of patient heterogeneity via survival-based Bayesian clustering

Cited by: 19
Authors
Ahmad, Ashar [1]
Froehlich, Holger [1,2]
Affiliations
[1] Univ Bonn, Bonn Aachen Int Ctr Informat Technol, D-53127 Bonn, Germany
[2] UCB Biosci GmbH, D-40789 Monheim, Germany
Keywords
HUMAN BREAST-CANCER; MIXTURE-MODELS; EXPRESSION; GLIOBLASTOMA; SELECTION; RECEPTOR; SUBTYPES; MARKER; GROWTH; TIME;
DOI
10.1093/bioinformatics/btx464
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Discovery of clinically relevant disease subtypes is of prime importance in personalized medicine. Disease subtype identification has often been explored in an unsupervised machine learning paradigm, which involves clustering patients based on available omics data, such as gene expression. A follow-up analysis then determines the clinical relevance of the molecular subtypes, for example by comparing their disease progressions. This methodology, however, fails to guarantee the separability of the subtypes based on their subtype-specific survival curves. We propose a new algorithm, Survival-based Bayesian Clustering (SBC), which simultaneously clusters heterogeneous omics and clinical end point data (time to event) in order to discover clinically relevant disease subtypes. For this purpose we formulate a novel Hierarchical Bayesian Graphical Model which combines a Dirichlet Process Gaussian Mixture Model with an Accelerated Failure Time model. In this way we make sure that patients are grouped in the same cluster only when they show similar characteristics with respect to molecular features across data types (e.g. gene expression, miRNA) as well as survival times. We extensively test our model in simulation studies and apply it to cancer patient data from the Breast Cancer dataset and The Cancer Genome Atlas repository. Notably, our method is not only able to find clinically relevant subgroups, but also predicts cluster membership and survival on test data better than competing methods.
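The coupling described in the abstract, where a patient joins a cluster only if both the molecular features and the survival time fit, can be illustrated with a small sketch. This is not the authors' SBC implementation: it assumes a fixed, finite set of clusters instead of a Dirichlet process, diagonal Gaussian features, a log-normal Accelerated Failure Time term, and already-known parameters (no posterior sampling). All function names and the dictionary keys are hypothetical.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_aft(time, event, x, intercept, beta, sigma):
    """Log-normal AFT term: log T = intercept + x.beta + sigma * eps, eps ~ N(0, 1).

    Uses the density for an observed event and the survival function
    for a right-censored observation.
    """
    linear = intercept + sum(xi * b for xi, b in zip(x, beta))
    z = (math.log(time) - linear) / sigma
    if event:
        return -0.5 * z * z - math.log(sigma * time * math.sqrt(2 * math.pi))
    return math.log(0.5 * math.erfc(z / math.sqrt(2)))

def cluster_posterior(x, time, event, clusters):
    """Membership probabilities from mixing weight, feature fit and survival fit."""
    scores = [
        math.log(c["weight"])
        + log_gaussian(x, c["mean"], c["var"])
        + log_aft(time, event, x, c["intercept"], c["beta"], c["sigma"])
        for c in clusters
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Two hypothetical clusters with identical molecular profiles but
# different typical survival times (AFT intercepts on the log scale).
clusters = [
    {"weight": 0.5, "mean": [0.0, 0.0], "var": [1.0, 1.0],
     "intercept": 1.0, "beta": [0.0, 0.0], "sigma": 0.5},
    {"weight": 0.5, "mean": [0.0, 0.0], "var": [1.0, 1.0],
     "intercept": 3.0, "beta": [0.0, 0.0], "sigma": 0.5},
]
probs = cluster_posterior([0.1, -0.2], math.exp(3.0), True, clusters)
```

In this toy setup the features alone cannot separate the two clusters, so the assignment is driven entirely by the survival term, which is the behaviour the abstract motivates.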
Pages: 3558-3566
Page count: 9