Causative Classification of Ischemic Stroke by the Machine Learning Algorithm Random Forests

Cited by: 11

Authors:

Wang, Jianan [1]

Gong, Xiaoxian [1]

Chen, Hongfang [2]

Zhong, Wansi [1]

Chen, Yi [1]

Zhou, Ying [1]

Zhang, Wenhua [1]

He, Yaode [1]

Lou, Min [1]

Affiliations:

[1] Zhejiang Univ, Sch Med, Dept Neurol, Affiliated Hosp 2, Hangzhou, Peoples R China

[2] Zhejiang Univ, Jinhua Municipal Cent Hosp, Dept Neurol, Jinhua Hosp, Jinhua, Zhejiang, Peoples R China

Source:

FRONTIERS IN AGING NEUROSCIENCE | 2022 / Volume 14

Funding:

National Natural Science Foundation of China;

Keywords:

machine learning; cardioembolism; large-artery atherosclerosis; small-artery occlusion; stroke; SUBTYPE; RECURRENCE; SYSTEMS; TOAST;

DOI:

10.3389/fnagi.2022.788637

Chinese Library Classification (CLC) number:

R592 [Geriatrics]; C [Social Sciences, General];

Discipline classification code:

03 ; 0303 ; 100203 ;

Abstract:

Background: Prognosis, recurrence rate, and secondary prevention strategies in acute ischemic stroke differ by etiology. However, identifying the cause is challenging. Objective: This study aimed to develop a model that identifies the cause of stroke using machine learning (ML) methods and to test its accuracy. Methods: We retrospectively reviewed data from CASE-II (NCT04487340) for patients whose etiology had been determined according to the Trial of ORG 10172 in Acute Stroke Treatment (TOAST) criteria, and used these data to train and evaluate six ML models, namely Random Forests (RF), Logistic Regression (LR), Extreme Gradient Boosting (XGBoost), K-Nearest Neighbor (KNN), Ada Boosting, and Gradient Boosting Machine (GBM), for the detection of cardioembolism (CE), large-artery atherosclerosis (LAA), and small-artery occlusion (SAO). Between October 2016 and April 2020, patients were enrolled consecutively for algorithm development (phase one). Between June 2020 and December 2020, patients were enrolled consecutively in a test set for algorithm testing (phase two). Area under the curve (AUC), precision, recall, accuracy, and F1 score were calculated for the prediction model. Results: A total of 18,209 patients were enrolled in phase one, of whom 13,590 (6,089 CE, 4,539 LAA, and 2,962 SAO) were included in the model; a total of 3,688 patients were enrolled in phase two, of whom 3,070 (1,103 CE, 1,269 LAA, and 698 SAO) were included in the model. RF, XGBoost, and GBM performed best among the six models, and we chose RF as our final model. On the test set, the AUC values of the RF model for predicting CE, LAA, and SAO were 0.981 (95% CI, 0.978-0.986), 0.919 (95% CI, 0.911-0.928), and 0.918 (95% CI, 0.908-0.927), respectively. The most important items for identifying CE, LAA, and SAO were atrial fibrillation and the degree of stenosis of intracranial arteries.
Conclusion: The proposed RF model could be a useful diagnostic tool to help neurologists categorize the etiologies of stroke.
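
The abstract reports precision, recall, accuracy, and F1 score for a three-class problem (CE, LAA, SAO). As a rough illustration of how such per-class figures are commonly derived, the sketch below computes them one-vs-rest from true and predicted labels. It is not the authors' code: the function name `per_class_metrics` and the toy labels are assumptions made up for this example, and AUC is left out because it needs predicted probabilities rather than hard labels.

```python
from collections import Counter

CLASSES = ("CE", "LAA", "SAO")  # the three TOAST subtypes named in the abstract


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return overall accuracy and one-vs-rest precision/recall/F1 per class."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    accuracy = sum(pairs[(c, c)] for c in classes) / len(y_true)
    report = {}
    for c in classes:
        tp = pairs[(c, c)]
        fp = sum(pairs[(other, c)] for other in classes if other != c)
        fn = sum(pairs[(c, other)] for other in classes if other != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Toy labels invented for illustration only; NOT data from the study.
y_true = ["CE", "CE", "CE", "LAA", "LAA", "LAA", "SAO", "SAO"]
y_pred = ["CE", "CE", "LAA", "LAA", "LAA", "SAO", "SAO", "SAO"]

accuracy, report = per_class_metrics(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}")
for name, scores in report.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Treating each subtype as "this class versus the other two" matches the way the abstract reports a separate AUC for CE, LAA, and SAO; whether the study averaged the remaining metrics across classes is not stated in the abstract.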


Pages: 11
