Hematoma expansion prediction based on SMOTE and XGBoost algorithm

Cited by: 5
Authors
Li, Yan [1]
Du, Chaonan [2]
Ge, Sikai [1]
Zhang, Ruonan [1]
Shao, Yiming [1]
Chen, Keyu [1]
Li, Zhepeng [1]
Ma, Fei [1]
Affiliations
[1] Xian Jiaotong Liverpool Univ, Dept Math & Phys, Suzhou, Peoples R China
[2] Nanjing Univ, Affiliated Jinling Hosp, Dept Neurosurg, Med Sch, Nanjing, Peoples R China
Keywords
Hematoma expansion; XGBoost; SMOTE; Machine learning prediction; Unbalanced dataset; INTRACEREBRAL HEMORRHAGE; GROWTH; IDENTIFICATION; BRAIN; RISK; SIGN
DOI
10.1186/s12911-024-02561-9
CLC number
R-058
Abstract
Hematoma expansion (HE) is a high-risk symptom that occurs frequently in patients who have suffered spontaneous intracerebral hemorrhage (ICH) after a major accident or illness. Correctly predicting HE in advance is critical for helping doctors decide on the next step of medical treatment. Most existing studies focus only on HE occurring within 6 h of ICH onset, whereas in reality a considerable number of patients develop HE after the first 6 h but within 24 h. In this study, following the recommendation of medical doctors, we focus on predicting the occurrence of HE within 24 h, as well as at 6 h intervals within 24 h. Based on demographics and information extracted from computed tomography (CT) images, we used the XGBoost method to predict the occurrence of HE within 24 h. To address the highly imbalanced data set, a frequent issue in medical data analysis, we used the SMOTE algorithm for data augmentation. To evaluate our method, we used a data set consisting of 582 patient records and compared the results of the proposed method with those of a few other machine learning methods. Our experiments show that XGBoost achieved the best prediction performance on the balanced data set processed by the SMOTE algorithm, with an accuracy of 0.82 and an F1-score of 0.82. Moreover, our proposed method predicts the occurrence of HE within 6, 12, 18 and 24 h with accuracies of 0.89, 0.82, 0.87 and 0.94, respectively, indicating that HE occurrence within 24 h can be predicted accurately by the proposed method.
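The oversampling step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, the parameter `k`, and the two-feature toy data are assumptions made only for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy data (not from the paper): two features per record.
majority = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.4, 0.2), (0.1, 0.4)]
minority = [(0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice the balanced set would then be passed to a classifier such as XGBoost. Oversampling is usually applied to the training split only, so that synthetic points do not leak into the evaluation data.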
Pages: 12