Development and validation of explainable machine-learning models for carotid atherosclerosis early screening

Cited by: 5

Authors:

Yun, Ke [1,2]

He, Tao [3]

Zhen, Shi [4]

Quan, Meihui [1,2]

Yang, Xiaotao [1,2]

Man, Dongliang [1,2]

Zhang, Shuang [1,2]

Wang, Wei [5]

Han, Xiaoxu [1,2,6,7]

Affiliations:

[1] China Med Univ, Affiliated Hosp 1, Natl Clin Res Ctr Lab Med, Shenyang, Liaoning, Peoples R China

[2] China Med Univ, Affiliated Hosp 1, Dept Lab Med, Shenyang, Liaoning, Peoples R China

[3] Neusoft Corp, Neusoft Res Inst, Shenyang, Liaoning, Peoples R China

[4] Northeastern Univ, Dept Software Engn, Shenyang, Liaoning, Peoples R China

[5] China Med Univ, Affiliated Hosp 1, Dept Phys Examinat Ctr, Shenyang, Liaoning, Peoples R China

[6] Chinese Acad Med Sci, Lab Med Innovat Unit, Shenyang, Liaoning, Peoples R China

[7] China Med Univ, Affiliated Hosp 1, NHC Key Lab AIDS Immunol, Shenyang, Liaoning, Peoples R China

Source:

JOURNAL OF TRANSLATIONAL MEDICINE | 2023 / Volume 21 / Issue 01

Keywords:

Machine learning; Carotid atherosclerosis; Explainable model; CHINESE ADULTS; RISK-FACTORS; PREVALENCE; ULTRASOUND; BURDEN; AGE; GENDER;

DOI:

10.1186/s12967-023-04093-8

Chinese Library Classification (CLC) number:

R-3 [Medical research methods]; R3 [Basic medicine];

Discipline classification code:

1001 ;

Abstract:

Background: Carotid atherosclerosis (CAS), an important factor in the development of stroke, is a major public health concern. The aim of this study was to establish and validate machine learning (ML) models for early screening of CAS using routine health check-up indicators in northeast China.

Methods: A total of 69,601 health check-up records from the health examination center of the First Hospital of China Medical University (Shenyang, China) were collected between 2018 and 2019. For the 2019 records, 80% were assigned to the training set and 20% to the testing set. The 2018 records were used as the external validation dataset. Ten ML algorithms, including decision tree (DT), K-nearest neighbors (KNN), logistic regression (LR), naive Bayes (NB), random forest (RF), multilayer perceptron (MLP), extreme gradient boosting machine (XGB), gradient boosting decision tree (GBDT), linear support vector machine (SVM-linear), and non-linear support vector machine (SVM-nonlinear), were used to construct CAS screening models. The area under the receiver operating characteristic curve (auROC) and the area under the precision-recall curve (auPR) were used as measures of model performance. The SHapley Additive exPlanations (SHAP) method was used to demonstrate the interpretability of the optimal model.

Results: A total of 6315 records of patients undergoing carotid ultrasonography were collected; of these, 1632, 407, and 1141 patients were diagnosed with CAS in the training, internal validation, and external validation datasets, respectively. The GBDT model achieved the highest performance metrics, with an auROC of 0.860 (95% CI 0.839-0.880) in the internal validation dataset and 0.851 (95% CI 0.837-0.863) in the external validation dataset. Individuals with diabetes or those over 65 years of age showed low negative predictive value. In the interpretability analysis, age was the most important factor influencing the performance of the GBDT model, followed by sex and non-high-density lipoprotein cholesterol.

Conclusions: The ML models developed could provide good performance for CAS identification using routine health check-up indicators and could potentially be applied in settings without ethnic or geographic heterogeneity for CAS prevention.
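The abstract reports model performance as an auROC with a 95% confidence interval. The following is a minimal sketch of how such a figure can be computed, not the authors' code: the record does not say how the intervals were derived, so the percentile bootstrap used here is an assumption, and the function names (`au_roc`, `bootstrap_ci`) and the toy labels and scores are hypothetical.

```python
import random

def au_roc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney U) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for auROC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # resample lacks one class; draw again
            continue
        stats.append(au_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data: hypothetical labels (1 = CAS) and model scores, not study data.
labels = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.2, 0.9, 0.5, 0.8]
auc = au_roc(labels, scores)
low, high = bootstrap_ci(labels, scores)
print(f"auROC = {auc:.3f} (95% CI {low:.3f}-{high:.3f})")
```

The pairwise formulation is quadratic in the sample size, which is fine for illustration; for tens of thousands of records a rank-based implementation from a statistics library would be the more practical choice.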

Pages: 13


