Using machine learning to predict acute myocardial infarction and ischemic heart disease in primary care cardiovascular patients

Times cited: 0
Authors
Salet, N. [1]
Gokdemir, A. [1, 2]
Preijde, J. [2]
van Heck, C. H. [3]
Eijkenaar, F. [1]
Affiliations
[1] Erasmus Univ, Erasmus Sch Hlth Policy & Management, Rotterdam, Netherlands
[2] Esculine Bv, Capelle Aan Den Ijssel, South Holland, Netherlands
[3] DrechtDokters, Hendrik Ido Ambacht, South Holland, Netherlands
Keywords
RISK; PERFORMANCE; VALIDATION; FRAMEWORK; EVENTS; ASTHMA
DOI
10.1371/journal.pone.0307099
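Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract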
Background: Early recognition, which preferably happens in primary care, is the most important tool to combat cardiovascular disease (CVD). This study aims to predict acute myocardial infarction (AMI) and ischemic heart disease (IHD) using Machine Learning (ML) in primary care cardiovascular patients. We compare the ML-models' performance with that of the common SMART algorithm and discuss clinical implications.
Methods and results: Patient-level medical record data (n = 13,218) collected between 2011-2021 from 90 GP-practices were used to construct two random forest models (one for AMI and one for IHD) as well as a linear model based on the SMART risk prediction algorithm as a suitable comparator. The data contained patient-level predictors, including demographics, procedures, medications, biometrics, and diagnosis. Temporal cross-validation was used to assess performance. Furthermore, predictors that contributed most to the ML-models' accuracy were identified. The ML-model predicting AMI had an accuracy of 0.97, a sensitivity of 0.67, a specificity of 1.00 and a precision of 0.99. The AUC was 0.96 and the Brier score was 0.03. The IHD-model had similar performance. In both ML-models anticoagulants/antiplatelet use, systolic blood pressure, mean blood glucose, and eGFR contributed most to model accuracy. For both outcomes, the SMART algorithm was substantially outperformed by ML on all metrics.
Conclusion: Our findings underline the potential of using ML for CVD prediction purposes in primary care, although the interpretation of predictors can be difficult. Clinicians, patients, and researchers might benefit from transitioning to using ML-models in support of individualized predictions by primary care physicians and subsequent (secondary) prevention.
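The abstract reports accuracy, sensitivity, specificity, precision, and the Brier score side by side. As a reading aid, the minimal Python sketch below shows how these quantities are conventionally defined from confusion-matrix counts and predicted probabilities. The function names and the example counts are hypothetical and chosen only for illustration; they are not the study's data or the authors' code.

```python
from typing import Dict, Sequence


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Derive threshold-based metrics from confusion-matrix counts.

    Assumes each denominator is non-zero (at least one event, one
    non-event, and one positive prediction).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all patients classified correctly
        "sensitivity": tp / (tp + fn),    # share of true events that were flagged
        "specificity": tn / (tn + fp),    # share of non-events correctly cleared
        "precision": tp / (tp + fp),      # share of flagged patients who had an event
    }


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and observed 0/1 outcome."""
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical counts (NOT the study's data): 100 events among 1,000 patients.
example = classification_metrics(tp=67, fp=1, tn=899, fn=33)
for name, value in example.items():
    print(f"{name}: {value:.3f}")

# Hypothetical predicted probabilities and outcomes for two patients.
print(f"brier: {brier_score([0.9, 0.1], [1, 0]):.3f}")
```

With a rare outcome, as in these illustrative counts, accuracy and specificity can sit close to 1 while sensitivity stays much lower, so the metrics are best read together rather than in isolation.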
Pages: 17