Survival prediction from clinico-genomic models - a comparative study

Cited by: 56
Authors
Bovelstad, Hege M. [1]
Nygard, Stale [1, 2]
Borgan, Ornulf [1]
Affiliations
[1] Univ Oslo, Dept Math, NO-0316 Oslo, Norway
[2] Norwegian Comp Ctr, NO-0314 Oslo, Norway
Keywords
GENE-EXPRESSION DATA; B-CELL LYMPHOMA; POSITIVE BREAST-CANCER; COX REGRESSION; INFORMATION; VALIDATION; PROGNOSIS; SELECTION; OUTCOMES; LASSO;
DOI
10.1186/1471-2105-10-413
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Survival prediction from high-dimensional genomic data is an active field in today's medical research. Most of the proposed prediction methods use genomic data alone, without considering established clinical covariates that are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may improve predictions, but systematic studies on the topic are lacking. Also, for the widely used Cox regression model, it is not obvious how to handle such combined models.
Results: We propose a way to combine classical clinical covariates with genomic data in a clinico-genomic prediction model based on the Cox regression model. The prediction model is obtained by using both types of covariates simultaneously, but applying dimension reduction only to the high-dimensional genomic variables. We describe how this can be done for seven well-known prediction methods: variable selection, unsupervised and supervised principal components regression and partial least squares regression, ridge regression, and the lasso. We further perform a systematic comparison of the performance of prediction models using clinical covariates only, genomic data only, or a combination of the two. The comparison is done using three survival data sets containing both clinical information and microarray gene expression data. Matlab code for the clinico-genomic prediction methods is available at http://www.med.uio.no/imb/stat/bmms/software/clinico-genomic/.
Conclusions: Based on our three data sets, the comparison shows that established clinical covariates will often lead to better predictions than what can be obtained from genomic data alone. In the cases where the genomic models are better than the clinical ones, ridge regression is used for dimension reduction. We also find that the clinico-genomic models tend to outperform the models based on only genomic data. Further, clinico-genomic models that use ridge regression give, for all three data sets, better predictions than models based on the clinical covariates alone.
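As a rough illustration of the idea described in the abstract (both covariate types enter the Cox model together, but only the genomic part is regularized), the following Python sketch fits a ridge-type clinico-genomic Cox model. It is not the authors' Matlab code. The function names, the plain gradient-descent optimizer, the step size, the treatment of tied event times, and the toy data are all assumptions made for illustration only.

```python
import math

# Hypothetical toy data: one clinical covariate and two "genomic" covariates.
CLIN = [[1.0], [0.0], [0.0], [1.0], [1.0], [0.0]]
GEN = [[0.5, -1.2], [-0.3, 0.8], [1.1, 0.1], [-0.9, -0.4], [0.2, 1.0], [-0.6, -0.3]]
TIME = [2.0, 5.0, 3.0, 8.0, 4.0, 7.0]
EVENT = [1, 1, 0, 1, 1, 0]  # 1 = event observed, 0 = censored


def linear_predictor(beta_clin, beta_gen, z, x):
    """Clinical part plus genomic part of the Cox linear predictor."""
    return (sum(b * v for b, v in zip(beta_clin, z))
            + sum(g * v for g, v in zip(beta_gen, x)))


def objective(beta_clin, beta_gen, clin, gen, time, event, lam):
    """Negative Cox partial log-likelihood plus a ridge penalty that is
    applied to the genomic coefficients only (clinical ones are unpenalized)."""
    eta = [linear_predictor(beta_clin, beta_gen, z, x) for z, x in zip(clin, gen)]
    nll = 0.0
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if d_i:
            # Risk set: everyone still under observation at time t_i.
            risk = sum(math.exp(eta[j]) for j, t_j in enumerate(time) if t_j >= t_i)
            nll -= eta[i] - math.log(risk)
    return nll + lam * sum(g * g for g in beta_gen)


def gradient(beta_clin, beta_gen, clin, gen, time, event, lam):
    """Gradient of `objective` with respect to (beta_clin, beta_gen)."""
    p, q = len(beta_clin), len(beta_gen)
    rows = [list(z) + list(x) for z, x in zip(clin, gen)]
    eta = [linear_predictor(beta_clin, beta_gen, z, x) for z, x in zip(clin, gen)]
    grad = [0.0] * (p + q)
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if not d_i:
            continue
        risk_set = [j for j, t_j in enumerate(time) if t_j >= t_i]
        weights = [math.exp(eta[j]) for j in risk_set]
        total = sum(weights)
        for k in range(p + q):
            mean_k = sum(w * rows[j][k] for w, j in zip(weights, risk_set)) / total
            grad[k] -= rows[i][k] - mean_k
    for k in range(q):
        grad[p + k] += 2.0 * lam * beta_gen[k]  # penalty term, genomic block only
    return grad[:p], grad[p:]


def fit(clin, gen, time, event, lam, step=0.05, iters=500):
    """Plain gradient descent from zero; step and iters are assumed values
    that suit this toy example, not general-purpose defaults."""
    beta_clin = [0.0] * len(clin[0])
    beta_gen = [0.0] * len(gen[0])
    for _ in range(iters):
        g_clin, g_gen = gradient(beta_clin, beta_gen, clin, gen, time, event, lam)
        beta_clin = [b - step * g for b, g in zip(beta_clin, g_clin)]
        beta_gen = [b - step * g for b, g in zip(beta_gen, g_gen)]
    return beta_clin, beta_gen


if __name__ == "__main__":
    bc, bg = fit(CLIN, GEN, TIME, EVENT, lam=1.0)
    print("clinical coefficients:", bc)
    print("genomic coefficients: ", bg)
```

In this sketch the penalty weight `lam` controls how strongly the genomic coefficients are shrunk toward zero, while the clinical coefficient is estimated without shrinkage. In practice `lam` would be chosen by cross-validation, and a dedicated optimizer would replace the fixed-step gradient descent.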
Pages: 9