Comparing machine learning approaches to incorporate time-varying covariates in predicting cancer survival time

Cited by: 11
Authors
Cygu, Steve [1]
Seow, Hsien [2]
Dushoff, Jonathan [1, 3]
Bolker, Benjamin M. [1, 3]
Affiliations
[1] McMaster Univ, Sch Computat Sci & Engn, 1280 Main St, Hamilton, ON L8S 4L8, Canada
[2] McMaster Univ, Dept Oncol, 1280 Main St, Hamilton, ON L8S 4L8, Canada
[3] McMaster Univ, Dept Biol, 1280 Main St, Hamilton, ON L8S 4L8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
REGRESSION-MODELS;
DOI
10.1038/s41598-023-28393-7
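Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract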
The Cox proportional hazards model is commonly used in evaluating risk factors in cancer survival data. The model assumes an additive, linear relationship between the risk factors and the log hazard. However, this assumption may be too simplistic. Further, failure to take time-varying covariates into account, if present, may lower prediction accuracy. In this retrospective, population-based, prognostic study of data from patients diagnosed with cancer from 2008 to 2015 in Ontario, Canada, we applied machine learning-based time-to-event prediction methods and compared their predictive performance in two sets of analyses: (1) yearly-cohort-based time-invariant and (2) fully time-varying covariates analysis. Machine learning-based methods, namely gradient boosting model (gbm), random survival forest (rsf), elastic net (enet), lasso, and ridge, were compared to the traditional Cox proportional hazards (coxph) model and to the prior study, which used the yearly-cohort-based time-invariant analysis. Using Harrell's C index as our primary measure, we found that using both machine learning techniques and incorporating time-dependent covariates can improve predictive performance. The gradient boosting machine showed the best performance on test data in both the time-invariant and the time-varying covariates analyses.
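The abstract names Harrell's C index as the primary performance measure. As a rough illustration of what that index measures, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function name `harrell_c_index`, its argument names, and the toy data are all hypothetical, and it makes the simplifying assumption that pairs with tied observed times are skipped (survival libraries handle such ties in different ways).

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk score.

    times       -- observed follow-up times
    events      -- 1/True if the event (death) was observed, 0/False if censored
    risk_scores -- model output; a higher score means a higher predicted risk

    Assumption for this sketch: pairs with equal observed times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore tied times
        # Identify which patient of the pair has the shorter observed time.
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0  # earlier event was given the higher risk
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration only).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. In the toy data, five pairs are comparable and four of them are ranked correctly, so `toy_c` is 0.8.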
Pages: 10