Cubic splines to model relationships between continuous variables and outcomes: a guide for clinicians

Cited by: 317
Authors
Gauthier, J. [1, 2]
Wu, Q. V. [1, 3, 4]
Gooley, T. A. [1, 3, 4]
Affiliations
[1] Fred Hutchinson Canc Res Ctr, Div Clin Res, 1124 Columbia St, Seattle, WA 98104 USA
[2] Univ Washington, Dept Med, Div Med Oncol, Seattle, WA 98195 USA
[3] Fred Hutchinson Canc Res Ctr, Div Publ Hlth Sci, 1124 Columbia St, Seattle, WA 98104 USA
[4] Fred Hutchinson Canc Res Ctr, Clin Biostat, 1124 Columbia St, Seattle, WA 98104 USA
Keywords
REGRESSION
DOI
10.1038/s41409-019-0679-x
Chinese Library Classification code
Q6 [Biophysics]
Discipline classification code
071011
Abstract
Series Editors' Note: We are pleased to add this typescript to the Bone Marrow Transplantation Statistics Series. We realize the term cubic splines may be a bit off-putting to some readers, but stay with us and don't get lost in polynomial equations. What the authors describe is important conceptually and in practice. Have you ever tried to buy a new pair of hiking boots? Getting the correct fit is critical; boots that are too small or too large will get you in big trouble! Now imagine that hiking boots came in only 2 sizes, small and large, and your foot size was somewhere in between. You are in trouble. Sailing perhaps? Transplant physicians are often interested in the association between two variables, say pre-transplant measurable residual disease (MRD) test state and an outcome, say cumulative incidence of relapse (CIR). We typically reduce the results of an MRD test to a binary, negative or positive, often defined by an arbitrary cut-point. However, MRD state is a continuous biological variable, and reducing it to a binary discards what may be important, useful data when we try to correlate it with CIR. Put otherwise, we may miss the trees for the forest. Another way to look at splines is as a technique for making smooth curves out of irregular data points. Consider, for example, trying to describe the surface of an egg. You could do it with a series of straight lines connecting points on the egg surface, but a much better representation would be to combine groups of points into curves and then combine the curves. To prove this, try drawing an egg using the draw feature in Microsoft PowerPoint; you are making splines. Gauthier and co-workers show us how to use cubic splines to get the maximum information from data points, which may, unkindly, not lend themselves to dichotomization or a best-fit line. Please read on. We hope readers will find their typescript interesting and exciting, and that it will give them a new way to think about how to analyse data.
And no, a spline is not a bunch of cactus spines. Robert Peter Gale, Imperial College London, and Mei-Jie Zhang, Medical College of Wisconsin and CIBMTR.
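The contrast the editors draw between a binary cut-point and a smooth curve can be sketched numerically. The example below is an illustration only and is not taken from the paper: the data are simulated, the sine-shaped relationship, the median cut-point, and the three quartile knots are arbitrary assumptions, and the spline uses a plain truncated power basis rather than whatever spline variant the authors recommend. The helper name `cubic_spline_basis` is hypothetical.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis for a cubic spline (illustrative helper).

    Columns: 1, x, x^2, x^3, then (x - k)_+^3 for each knot k.
    """
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

# Simulated data (an assumption, not study data): a smooth nonlinear
# relationship between a continuous predictor x and an outcome y.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, x.size)

# Dichotomised model: split x at its median and predict the group mean.
cut = np.median(x)
high = x > cut
pred_binary = np.where(high, y[high].mean(), y[~high].mean())

# Cubic spline model: least-squares fit on the spline basis,
# with knots placed at the quartiles of x.
knots = np.quantile(x, [0.25, 0.5, 0.75])
B = cubic_spline_basis(x, knots)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
pred_spline = B @ coef

# Residual sums of squares: smaller means the model tracks the data better.
rss_binary = float(np.sum((y - pred_binary) ** 2))
rss_spline = float(np.sum((y - pred_spline) ** 2))
print(f"RSS, dichotomised: {rss_binary:.1f}")
print(f"RSS, cubic spline: {rss_spline:.1f}")
```

On data like these, the two-level model can only report one average per group, whereas the spline follows the curvature within each group. In a real analysis the spline terms would enter a regression model suited to the outcome (for example, a time-to-event model for relapse), which this sketch does not attempt.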
Pages: 675-680
Page count: 6