Validation of clinical prediction models: what does the "calibration slope" really measure?

Cited by: 88
Authors
Stevens, Richard J. [1]
Poppe, Katrina K. [2]
Affiliations
[1] Univ Oxford, Nuffield Dept Primary Care Hlth Sci, Oxford, England
[2] Univ Auckland, Fac Med & Hlth Sci, Auckland, New Zealand
Keywords
Clinical prediction rule; Calibration; Validation; Discrimination; Spread; Slope; Logistic regression models; Performance
DOI
10.1016/j.jclinepi.2019.09.016
Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Abstract
Background and Objectives: Definitions of calibration, an aspect of model validation, have evolved over time. We examine use and interpretation of the statistic currently referred to as the calibration slope.

Methods: The history of the term "calibration slope", and usage in papers published in 2016 and 2017, were reviewed. The behaviour of the slope in illustrative hypothetical examples and in two examples in the clinical literature was demonstrated.

Results: The paper in which the statistic was proposed described it as a measure of "spread" and did not use the term "calibration". In illustrative examples, slope of 1 can be associated with good or bad calibration, and this holds true across different definitions of calibration. In data extracted from a previous study, the slope was correlated with discrimination, not overall calibration. Many authors of recent papers interpret the slope as a measure of calibration; a minority interpret it as a measure of discrimination or do not explicitly categorise it as either. Seventeen of thirty-three papers used the slope as the sole measure of calibration.

Conclusion: Misunderstanding about this statistic has led to many papers in which it is the sole measure of calibration, which should be discouraged.

© 2019 The Authors. Published by Elsevier Inc.
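The claim that a slope of 1 can coexist with poor calibration can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes the common definition of the calibration slope as the coefficient b in a logistic regression of the observed outcome on the logit of the predicted risk, and it uses a hypothetical model whose predictions are too low by a constant 1 unit on the logit scale. The helper name `fit_intercept_slope`, the sample size, the seed and the size of the shift are all illustrative assumptions.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_intercept_slope(x, y, iters=25):
    """Fit logit(P(y = 1)) = a + b * x by Newton-Raphson and return (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(a + b * xi)
            r = yi - p            # residual, contributes to the score
            w = p * (1.0 - p)     # weight, contributes to the information matrix
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: the true risk is sigmoid(true_lp).
rng = random.Random(2019)
n = 20000
true_lp = [rng.uniform(-2.0, 2.0) for _ in range(n)]
y = [1 if rng.random() < sigmoid(lp) else 0 for lp in true_lp]

# Assumed miscalibrated model: every prediction is shifted down by 1 on the
# logit scale, so risk is systematically underestimated.
pred = [sigmoid(lp - 1.0) for lp in true_lp]

# Calibration slope (and intercept): regress outcomes on logit(predicted risk).
intercept, slope = fit_intercept_slope([logit(p) for p in pred], y)

mean_pred = sum(pred) / n
obs_rate = sum(y) / n
print(f"calibration slope     = {slope:.2f}")
print(f"calibration intercept = {intercept:.2f}")
print(f"mean predicted risk   = {mean_pred:.2f}, observed event rate = {obs_rate:.2f}")
```

Under these assumptions the fitted slope should be close to 1 while the mean predicted risk falls well below the observed event rate, which is why the slope alone cannot certify good calibration.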
Pages: 93-99
Number of pages: 7