Causative Classification of Ischemic Stroke by the Machine Learning Algorithm Random Forests

Cited by: 11
Authors
Wang, Jianan [1]
Gong, Xiaoxian [1]
Chen, Hongfang [2]
Zhong, Wansi [1]
Chen, Yi [1]
Zhou, Ying [1]
Zhang, Wenhua [1]
He, Yaode [1]
Lou, Min [1]
Affiliations
[1] Zhejiang Univ, Sch Med, Dept Neurol, Affiliated Hosp 2, Hangzhou, Peoples R China
[2] Zhejiang Univ, Jinhua Municipal Cent Hosp, Dept Neurol, Jinhua Hosp, Jinhua, Zhejiang, Peoples R China
Source
FRONTIERS IN AGING NEUROSCIENCE | 2022 / Volume 14
Funding
National Natural Science Foundation of China;
Keywords
machine learning; cardioembolism; large-artery atherosclerosis; small-artery occlusion; stroke; SUBTYPE; RECURRENCE; SYSTEMS; TOAST;
DOI
10.3389/fnagi.2022.788637
Chinese Library Classification number
R592 [Geriatrics]; C [Social Sciences, General];
Discipline classification code
03 ; 0303 ; 100203 ;
Abstract
Background: Prognosis, recurrence rate, and secondary prevention strategies in acute ischemic stroke differ by etiology. However, identifying the cause is challenging.
Objective: This study aimed to develop a model that identifies the cause of stroke using machine learning (ML) methods and to test its accuracy.
Methods: We retrospectively reviewed data from CASE-II (NCT04487340) on patients whose etiology had been determined according to the Trial of ORG 10172 in Acute Stroke Treatment (TOAST) classification. These data were used to train and evaluate six ML models, namely, Random Forests (RF), Logistic Regression (LR), Extreme Gradient Boosting (XGBoost), K-Nearest Neighbor (KNN), Ada Boosting, and Gradient Boosting Machine (GBM), for the detection of cardioembolism (CE), large-artery atherosclerosis (LAA), and small-artery occlusion (SAO). Patients enrolled consecutively between October 2016 and April 2020 were used for algorithm development (phase one). Patients enrolled consecutively between June 2020 and December 2020 formed the test set (phase two). Area under the curve (AUC), precision, recall, accuracy, and F1 score were calculated for the prediction model.
Results: A total of 18,209 patients were enrolled in phase one, of whom 13,590 (6,089 CE, 4,539 LAA, and 2,962 SAO) were included in the model. A total of 3,688 patients were enrolled in phase two, of whom 3,070 (1,103 CE, 1,269 LAA, and 698 SAO) were included in the model. Among the six models, the best-performing were RF, XGBoost, and GBM, and we chose the RF model as our final model. On the test set, the AUC values of the RF model for predicting CE, LAA, and SAO were 0.981 (95% CI, 0.978-0.986), 0.919 (95% CI, 0.911-0.928), and 0.918 (95% CI, 0.908-0.927), respectively. The most important items for identifying CE, LAA, and SAO were atrial fibrillation and the degree of stenosis of intracranial arteries.
Conclusion: The proposed RF model could be a useful diagnostic tool to help neurologists categorize etiologies of stroke.
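The evaluation described in the abstract (a Random Forests classifier scored per subtype with AUC, precision, recall, accuracy, and F1) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes scikit-learn, uses synthetic data from make_classification in place of the CASE-II registry, and uses a random stratified split rather than the phase one / phase two temporal split. The function name evaluate_rf, the hyperparameters, and the mapping of class indices 0, 1, 2 to CE, LAA, SAO are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score,
                             precision_recall_fscore_support,
                             roc_auc_score)
from sklearn.model_selection import train_test_split

# Hypothetical mapping of class indices to the three TOAST subtypes.
LABELS = ["CE", "LAA", "SAO"]


def evaluate_rf(seed=0):
    """Train a Random Forests model on synthetic 3-class data and
    return accuracy, per-class metrics, and feature importances."""
    # Synthetic stand-in for clinical features (assumption, not real data).
    X, y = make_classification(n_samples=1500, n_features=12,
                               n_informative=6, n_classes=3,
                               random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)

    proba = model.predict_proba(X_test)   # columns follow model.classes_
    pred = model.predict(X_test)

    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, pred, labels=[0, 1, 2], zero_division=0)

    report = {"accuracy": accuracy_score(y_test, pred)}
    for k, name in enumerate(LABELS):
        # One-vs-rest AUC: subtype k against the other two subtypes.
        auc = roc_auc_score((y_test == k).astype(int), proba[:, k])
        report[name] = {"auc": auc, "precision": precision[k],
                        "recall": recall[k], "f1": f1[k]}

    # Impurity-based importances, one value per input feature.
    report["importances"] = model.feature_importances_
    return report


if __name__ == "__main__":
    result = evaluate_rf()
    print("accuracy:", round(result["accuracy"], 3))
    for name in LABELS:
        print(name, {m: round(v, 3) for m, v in result[name].items()})
```

Scoring each subtype one-vs-rest matches the way the abstract reports a separate AUC for CE, LAA, and SAO. The feature importances here are impurity-based; the abstract does not say how the authors ranked items, so that choice is also an assumption.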
Pages: 11
Related papers
50 in total
  • [21] Detection of acute ischemic stroke and backtracking stroke onset time via machine learning analysis of metabolomics
    Zhang, Yiheng
    Zhu, Dayu
    Li, Tao
    Wang, Xiaoya
    Zhao, Lili
    Yang, Xiaofei
    Dang, Meijuan
    Li, Ye
    Wu, Yulun
    Lu, Ziwei
    Lu, Jialiang
    Jian, Yating
    Wang, Heying
    Zhang, Lei
    Lu, Xiaoyun
    Shen, Ziyu
    Fan, Hong
    Cai, Wenshan
    Zhang, Guilian
    BIOMEDICINE & PHARMACOTHERAPY, 2022, 155
  • [22] Crytojacking Classification based on Machine Learning Algorithm
    Mansor, Wan Nur Aaisyah Binti Wan
    Ahmad, Azuan
    Zainudin, Wan Shafiuddin
    Saudi, Madihah Mohd
    Kama, Mohd Nazri
    ICCBN 2020: 2020 8TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND BROADBAND NETWORKING / ICCET 2020: 2020 3RD INTERNATIONAL CONFERENCE ON COMMUNICATION ENGINEERING AND TECHNOLOGY, 2020, : 73 - 76
  • [23] Machine Learning Algorithm in Network Traffic Classification
    Rachmawati, Syifa Maliah
    Kim, Dong-Seong
    Lee, Jae-Min
    12TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE (ICTC 2021): BEYOND THE PANDEMIC ERA WITH ICT CONVERGENCE INNOVATION, 2021, : 1010 - 1013
  • [24] Early and intermediate prognosis of intravenous thrombolytic therapy in acute ischemic stroke subtypes according to the causative classification of stroke system
    Pashapour, Ali
    Atalu, Abolfazl
    Farhoudi, Mehdi
    Taheraghdam, Ali-Akbar
    Hokmabadi, Elyar Sadeghi
    Sharifipour, Ehsan
    NajafiNeshli, Mehdi
    PAKISTAN JOURNAL OF MEDICAL SCIENCES, 2013, 29 (01) : 181 - 186
  • [25] Oxides Classification with Random Forests
    Xiao, Kai
    Chen, Baitong
    Bao, Wenzheng
    Cheng, Honglin
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2022, PT II, 2022, 13394 : 680 - 686
  • [26] Random forests for classification in ecology
    Cutler, D. Richard
    Edwards, Thomas C., Jr.
    Beard, Karen H.
    Cutler, Adele
    Hess, Kyle T.
    ECOLOGY, 2007, 88 (11) : 2783 - 2792
  • [27] Interpretable Machine Learning Modeling for Ischemic Stroke Outcome Prediction
    Jabal, Mohamed Sobhi
    Joly, Olivier
    Kallmes, David
    Harston, George
    Rabinstein, Alejandro
    Huynh, Thien
    Brinjikji, Waleed
    FRONTIERS IN NEUROLOGY, 2022, 13
  • [28] Comparison of ischemic stroke diagnosis models based on machine learning
    Yang, Wan-Xia
    Wang, Fang-Fang
    Pan, Yun-Yan
    Xie, Jian-Qin
    Lu, Ming-Hua
    You, Chong-Ge
    FRONTIERS IN NEUROLOGY, 2022, 13
  • [29] A machine learning model for visualization and dynamic clinical prediction of stroke recurrence in acute ischemic stroke patients: A real-world retrospective study
    Wang, Kai
    Shi, Qianqian
    Sun, Chao
    Liu, Wencai
    Yau, Vicky
    Xu, Chan
    Liu, Haiyan
    Sun, Chenyu
    Yin, Chengliang
    Wei, Xiu'e
    Li, Wenle
    Rong, Liangqun
    FRONTIERS IN NEUROSCIENCE, 2023, 17
  • [30] Random Forests with Economic Roots: Explaining Machine Learning in Hedonic Imputation
    Zeng, Shipei
    Rao, Deyu
    COMPUTATIONAL ECONOMICS, 2024,