Investigating the prediction ability of survival models based on both clinical and omics data: two case studies

Cited by: 36
Authors
De Bin, Riccardo [1]
Sauerbrei, Willi [2]
Boulesteix, Anne-Laure [1]
Affiliations
[1] Univ Munich, Dept Med Informat Biometry & Epidemiol, D-81377 Munich, Germany
[2] Univ Med Ctr Freiburg, Dept Med Biometry & Med Informat, Freiburg, Germany
Keywords
clinical information; combining clinical and omics data; high-dimensional data; prediction models; survival analysis; GENE-EXPRESSION ANALYSIS; VARIABLE SELECTION; GENOMIC MODELS; TIME; CLASSIFICATION; PERFORMANCE; VALIDATION; REGRESSION; REGULARIZATION; MICROARRAYS;
DOI
10.1002/sim.6246
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
In biomedical literature, numerous prediction models for clinical outcomes have been developed based either on clinical data or, more recently, on high-throughput molecular data (omics data). Prediction models based on both types of data, however, are less common, although some recent studies suggest that a suitable combination of clinical and molecular information may lead to models with better predictive abilities. This is probably due to the fact that it is not straightforward to combine data with different characteristics and dimensions (poorly characterized high-dimensional omics data, well-investigated low-dimensional clinical data). In this paper, we analyze two publicly available datasets related to breast cancer and neuroblastoma, respectively, in order to show some possible ways to combine clinical and omics data into a prediction model of time-to-event outcome. Different strategies and statistical methods are exploited. The results are compared and discussed according to different criteria, including the discriminative ability of the models, computed on a validation dataset. Copyright (c) 2014 John Wiley & Sons, Ltd.
Pages: 5310-5329
Number of pages: 20