In-depth mining of clinical data: the construction of clinical prediction model with R

Cited by: 239
Authors
Zhou, Zhi-Rui [1]
Wang, Wei-Wei [2, 3]
Li, Yan [4]
Jin, Kai-Rui [5]
Wang, Xuan-Yi [5]
Wang, Zi-Wei [6]
Chen, Yi-Shan [7]
Wang, Shao-Jia [3, 8]
Hu, Jing [7]
Zhang, Hui-Na [7]
Huang, Po [7]
Zhao, Guo-Zhen [7]
Chen, Xing-Xing [5]
Li, Bo [7]
Zhang, Tian-Song [9]
Affiliations
[1] Fudan Univ, Huashan Hosp, Shanghai Med Coll, Dept Radiotherapy, Shanghai 200040, Peoples R China
[2] Kunming Med Univ, Affiliated Hosp 3, Dept Thorac Surg, Kunming 650118, Yunnan, Peoples R China
[3] Yunnan Prov Tumor Hosp, Kunming 650118, Yunnan, Peoples R China
[4] Harbin Med Univ, Affiliated Hosp 4, Dept Anesthesiol, Harbin 150001, Heilongjiang, Peoples R China
[5] Fudan Univ, Shanghai Med Coll, Shanghai Canc Ctr, Dept Radiat Oncol, Shanghai 200040, Peoples R China
[6] Second Mil Med Univ, Changhai Hosp, Dept Urol, Shanghai 200040, Peoples R China
[7] Capital Med Univ, Beijing Hosp Tradit Chinese Med, Beijing Inst Tradit Chinese Med, Beijing 100010, Peoples R China
[8] Kunming Med Univ, Affiliated Hosp 3, Dept Gynecol Oncol, Kunming 650118, Yunnan, Peoples R China
[9] Fudan Univ, Jingan Dist Cent Hosp, Internal Med Tradit Chinese Med Dept, Shanghai 200040, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Clinical prediction models; R; statistical computing; OF-FIT TEST; BIG DATA; REGRESSION; SUBDISTRIBUTION; DISCRIMINATION; IMPROVEMENT; NOMOGRAM; HISTORY; CURVE; TESTS;
DOI
10.21037/atm.2019.08.63
Chinese Library Classification number
R73 [Oncology]
Discipline classification code
100214
Abstract
This article is a methodology series on the construction of clinical prediction models, comprising 16 sections. The first section introduces the concept of clinical prediction models, their current applications, construction methods and workflow, and classification, together with the prerequisites for conducting such research and the problems currently faced. The second section concentrates on variable screening methods in multivariate regression analysis. The third section introduces the construction of prediction models based on logistic regression and the drawing of nomograms. The fourth section concentrates on the Cox proportional hazards regression model and the drawing of nomograms. The fifth section introduces the calculation of the C-statistic in the logistic regression model. The sixth section introduces two common methods for calculating the C-index in Cox regression with R. The seventh section focuses on the principle and calculation of the Net Reclassification Index (NRI) using R. The eighth section focuses on the principle and calculation of the Integrated Discrimination Index (IDI) using R. The ninth section explores a method for evaluating clinical utility after a predictive model has been constructed: decision curve analysis. The tenth section supplements the ninth and introduces decision curve analysis for survival outcome data. The eleventh section discusses external validation of the logistic regression model. The twelfth section discusses in-depth evaluation of the Cox regression model with R, including calculating the concordance index (C-index) in the validation data set and drawing the calibration curve. The thirteenth section introduces how to handle survival outcome data using a competing risk model with R. The fourteenth section introduces how to draw the nomogram of a competing risk model with R. The fifteenth section discusses the identification of outliers and the imputation of missing values. The sixteenth section introduces advanced variable selection methods for linear models, such as Ridge regression and LASSO regression.
Pages: 96