Double-blind evaluation and benchmarking of survival models in a multi-centre study

Cited by: 13
Authors
Taktak, A.
Antolini, L.
Aung, M.
Boracchi, P.
Campbell, I.
Damato, B.
Ifeachor, E.
Lama, N.
Lisboa, P.
Setzkorn, C.
Stalbovskaya, V.
Biganzoli, E.
Affiliations
[1] Royal Liverpool Univ Hosp, Dept Clin Engn, Liverpool, Merseyside, England
[2] Ist Nazl Studio & Cura Tumori, Unita Stat Med & Biometria, I-20133 Milan, Italy
[3] Liverpool John Moores Univ, Sch Math & Comp Sci, Liverpool L3 5UX, Merseyside, England
[4] Univ Milan, Ist Stat Med & Biometria, I-20122 Milan, Italy
[5] IC Stat Serv, Wirral, Merseyside, England
[6] Royal Liverpool Univ Hosp, St Pauls Eye Unit, Liverpool, Merseyside, England
[7] Univ Plymouth, Sch Commun & Elect, Plymouth PL4 8AA, Devon, England
Keywords
evaluation studies; double-blind study; multi-centre studies; survival analysis; uveal neoplasms
DOI
10.1016/j.compbiomed.2006.10.001
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Accurate modelling of time-to-event data is of particular importance for both exploratory and predictive analysis in cancer, and can have a direct impact on clinical care. This study presents a detailed double-blind evaluation of the accuracy of out-of-sample mortality prediction from two generic non-linear models based on artificial neural networks, benchmarked against partial logistic spline, log-normal and Cox regression models. A data set containing 2880 samples was shared over the Internet using a purpose-built secure environment called GEOCONDA (www.geoconda.com). The evaluation was carried out in three parts. The first was a comparison between the predicted survival estimates for each of the four survival groups defined by the TNM staging system and the empirical estimates derived by the Kaplan-Meier method. The second focused on the accurate prediction of survival over time, quantified with the time-dependent C index (C-td). Finally, calibration plots were obtained over the range of follow-up and tested using a generalization of the Hosmer-Lemeshow test. All models showed satisfactory performance, with values of C-td of about 0.7. None of the models showed a systematic tendency towards over- or underestimation of the observed survival at tau = 3 and 5 years. At tau = 10 years, all models underestimated the observed survival, except for Cox regression, which returned an overestimate. The study presents a robust and unbiased benchmarking methodology using a bespoke web facility. It was concluded that powerful, recent flexible modelling algorithms show predictive performance comparable to that of more established methods from the medical and biological literature, for the reference data set. © 2006 Elsevier Ltd. All rights reserved.
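The first part of the evaluation compares model predictions against empirical Kaplan-Meier estimates. As a minimal illustration of that reference quantity only, the sketch below computes a Kaplan-Meier curve and reads it off at a horizon tau. The function names `kaplan_meier` and `km_at` and the toy follow-up data are assumptions made for this example; they are not taken from the study or from GEOCONDA, and this is not the authors' implementation.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each subject
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)   # every subject leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]  # remove deaths and censored subjects alike
    return curve


def km_at(curve, tau):
    """Read the step function S(t) at horizon tau (1.0 before the first death)."""
    surv = 1.0
    for t, s in curve:
        if t > tau:
            break
        surv = s
    return surv


if __name__ == "__main__":
    # Hypothetical toy data (years of follow-up), not from the study.
    times = [1, 2, 2, 3, 4, 5]
    events = [1, 1, 0, 1, 0, 1]
    curve = kaplan_meier(times, events)
    for tau in (3, 5):
        print(f"tau = {tau}: empirical survival = {km_at(curve, tau):.3f}")
```

In a comparison of the kind the abstract describes, a model's mean predicted survival at tau within each TNM stage group would be set against `km_at` computed on that same group; how the study aggregated predictions is not stated here, so that step is left out.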
Pages: 1108-1120
Number of pages: 13