Stroke Management and Analysis Risk Tool (SMART): An interpretable clinical application for diabetes-related stroke prediction

Times cited: 0
Authors
Sun, Yumeng [1,3]
Li, Jiaxi [1 ]
He, Haiyang [2 ]
Xing, Gaochang [3 ]
Liu, Zixuan [2 ]
Meng, Qingpeng [3 ]
Xu, Mingjun [2 ]
Huang, Letian [3 ]
Pan, Zhe [3 ]
Liao, Jun [2 ]
Ji, Cheng [4 ]
Affiliations
[1] China Pharmaceut Univ, Nanjing Drum Tower Hosp, Dept Pharm, 639 Longmian Ave, Nanjing 211198, Jiangsu, Peoples R China
[2] China Pharmaceut Univ, Sch Sci, 639 Longmian Ave, Nanjing 211198, Jiangsu, Peoples R China
[3] China Pharmaceut Univ, Sch Basic Med & Clin Pharm, 639 Longmian Ave, Nanjing 211198, Jiangsu, Peoples R China
[4] Nanjing Univ, Nanjing Drum Tower Hosp, Affiliated Hosp, Med Sch, Dept Pharm, 321 Zhongshan Rd, Nanjing, Peoples R China
Keywords
Type 2 diabetes mellitus; Stroke; Machine learning; Predictive models; Interpretability; BLOOD-PRESSURE; INFARCTION; DISEASE; SMOTE;
DOI
10.1016/j.numecd.2024.103841
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Background and aims: The growing global burden of diabetes and stroke poses a significant public health challenge. This study aimed to analyze risk factors and build an interpretable stroke prediction model for patients with diabetes. Methods and results: Data from 20,014 patients were collected from the Affiliated Drum Tower Hospital, Medical School of Nanjing University, between 2021 and 2022. After missing values were handled, features were selected using LASSO, SVM-RFE, and multi-factor regression. The dataset was split 8:2 into training and test sets, and the Synthetic Minority Oversampling Technique (SMOTE) was used to balance the classes. Several machine learning and deep learning methods, including Random Forest (RF) and deep neural networks (DNN), were used for model training. SHAP values and a dedicated website were used to demonstrate the interpretability and practicality of the model. The study identified 11 factors influencing stroke incidence, and the RF and DNN algorithms achieved AUC values of 0.95 and 0.91, respectively. The Stroke Management and Analysis Risk Tool (SMART) was developed for clinical use. Primary endpoint: the predictive performance of SMART in assessing stroke risk in diabetic patients, evaluated using AUC. Secondary endpoints: classification performance (precision, recall, F1-score), interpretability via SHAP values, and clinical utility, with emphasis on the user interface. Statistical analysis: EHR data were analyzed using univariate and multivariate methods, and the models were validated on a separate test set. Conclusions: An interpretable stroke prediction model was created for patients with diabetes. The model suggests that standard clinical and laboratory parameters can predict stroke risk in individuals with diabetes.
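The split, oversampling, and AUC steps named in the abstract can be sketched roughly as below. This is an illustrative sketch only, not the authors' code: the data are synthetic, all function names are hypothetical, the split is assumed to be stratified, SMOTE is assumed to be applied to the training portion only, and a nearest-centroid score stands in for the RF and DNN models, which the abstract does not describe in detail.

```python
import math
import random

def stratified_split_80_20(rows, labels, seed=0):
    """Split each class 8:2 into train and test (stratification is an assumption)."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, t in enumerate(labels) if t == cls]
        rng.shuffle(idx)
        cut = len(idx) * 8 // 10
        train += idx[:cut]
        test += idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

def smote(minority, n_new, k=5, seed=0):
    """Minimal SMOTE: interpolate between a minority sample and one of
    its k nearest minority neighbours to create each synthetic point."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, nb)])
    return synthetic

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid(points):
    """Per-feature mean of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

# Synthetic, imbalanced toy data: 180 negatives and 20 positives, two features.
rng = random.Random(42)
X = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(180)]
     + [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(20)])
y = [0] * 180 + [1] * 20

X_tr, y_tr, X_te, y_te = stratified_split_80_20(X, y)

# Oversample the minority class in the training set until classes are balanced.
minority = [x for x, t in zip(X_tr, y_tr) if t == 1]
majority = [x for x, t in zip(X_tr, y_tr) if t == 0]
new_points = smote(minority, n_new=len(majority) - len(minority))
X_bal = X_tr + new_points
y_bal = y_tr + [1] * len(new_points)

# Stand-in scorer: a point scores higher when it is closer to the positive centroid.
c_pos = centroid([x for x, t in zip(X_bal, y_bal) if t == 1])
c_neg = centroid([x for x, t in zip(X_bal, y_bal) if t == 0])
scores = [math.dist(x, c_neg) - math.dist(x, c_pos) for x in X_te]
test_auc = auc(scores, y_te)
print(f"balanced training size: {len(X_bal)}, test AUC: {test_auc:.2f}")
```

The test set is left untouched by oversampling so that the AUC reflects the original class balance; whether the study followed this ordering is not stated in the abstract.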
Pages: 9