Evaluating prediction model performance

Cited by: 21
Authors
Cabot, John H. [1]
Ross, Elsie Gyang [1, 2, 3]
Affiliations
[1] Stanford Univ, Dept Surg, Div Vasc Surg, Sch Med, Stanford, CA 94305 USA
[2] Stanford Univ, Sch Med, Ctr Biomed Informat Res, Stanford, CA USA
[3] Div Vasc Surg, 780 Welch Rd, Palo Alto, CA 94304 USA
DOI
10.1016/j.surg.2023.05.023
Chinese Library Classification
R61 [Operative Surgery]
Abstract
This article highlights important performance metrics to consider when evaluating models developed for supervised classification or regression tasks using clinical data. We detail the basics of confusion matrices, receiver operating characteristic curves, F1 scores, precision-recall curves, mean squared error, and other considerations in evaluating model performance. In this era of rapidly proliferating advanced prediction models, familiarity with performance metrics beyond the area under the receiver operating characteristic curve, and with the nuances of evaluating model value upon implementation, is essential to ensure effective resource allocation and optimal patient care delivery. © 2023 Elsevier Inc. All rights reserved.
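As a rough illustration of several metrics named in the abstract, the sketch below computes confusion-matrix counts, precision, recall, F1 score, and mean squared error in plain Python. It is not taken from the article: the function names and the toy labels are hypothetical, and the formulas are the standard textbook definitions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0 or 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall (sensitivity), and their harmonic mean (F1)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mean_squared_error(y_true, y_pred):
    """Average squared difference, used for regression tasks."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data, for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(labels, predicted)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
mse = mean_squared_error([2.0, 4.0, 6.0], [2.5, 3.5, 6.0])

print(tp, fp, fn, tn)
print(precision, recall, f1)
print(mse)
```

Receiver operating characteristic and precision-recall curves follow from the same counts by repeating the calculation at each decision threshold applied to the model's predicted probabilities.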
Pages: 723-726
Page count: 4