Fostering reproducibility and generalizability in machine learning for clinical prediction modeling in spine surgery

Cited by: 35
Authors
Azad, Tej D. [1]
Ehresman, Jeff [1]
Ahmed, Ali Karim [1]
Staartjes, Victor E. [2,3]
Lubelski, Daniel [1]
Stienen, Martin N. [2,3]
Veeravagu, Anand [4]
Ratliff, John K. [4]
Affiliations
[1] Johns Hopkins Univ Hosp, Dept Neurosurg, 1800 Orleans St, Baltimore, MD 21287 USA
[2] Univ Zurich, Machine Intelligence Clin Neurosci MICN Lab, Clin Neurosci Ctr, Zurich, Switzerland
[3] Univ Zurich Hosp, Dept Neurosurg, Zurich, Switzerland
[4] Stanford Univ, Sch Med, Dept Neurosurg, Stanford, CA 94305 USA
Keywords
Machine learning; Predictive modeling; Overfitting; Reproducibility; Classification
DOI
10.1016/j.spinee.2020.10.006
Chinese Library Classification (CLC) number
R74 [Neurology and Psychiatry]
Abstract
As the use of machine learning algorithms in the development of clinical prediction models has increased, researchers are becoming more aware of the deleterious effects that stem from the lack of reporting standards. One of the most obvious consequences is the insufficient reproducibility found in current prediction models. In an attempt to characterize methods to improve reproducibility and to allow for better clinical performance, we utilize a previously proposed taxonomy that separates reproducibility into 3 components: technical, statistical, and conceptual reproducibility. By following this framework, we discuss common errors that lead to poor reproducibility, highlight the importance of generalizability when evaluating an ML model's performance, and provide suggestions to optimize generalizability to ensure adequate performance. These efforts are a necessity before such models are applied to patient care. © 2020 Elsevier Inc. All rights reserved.
Pages: 1610-1616
Page count: 7