Development and validation of explainable machine-learning models for carotid atherosclerosis early screening

Cited by: 5

Authors:

Yun, Ke [1,2]

He, Tao [3]

Zhen, Shi [4]

Quan, Meihui [1,2]

Yang, Xiaotao [1,2]

Man, Dongliang [1,2]

Zhang, Shuang [1,2]

Wang, Wei [5]

Han, Xiaoxu [1,2,6,7]

Affiliations:

[1] China Med Univ, Affiliated Hosp 1, Natl Clin Res Ctr Lab Med, Shenyang, Liaoning, Peoples R China

[2] China Med Univ, Affiliated Hosp 1, Dept Lab Med, Shenyang, Liaoning, Peoples R China

[3] Neusoft Corp, Neusoft Res Inst, Shenyang, Liaoning, Peoples R China

[4] Northeastern Univ, Dept Software Engn, Shenyang, Liaoning, Peoples R China

[5] China Med Univ, Affiliated Hosp 1, Dept Phys Examinat Ctr, Shenyang, Liaoning, Peoples R China

[6] Chinese Acad Med Sci, Lab Med Innovat Unit, Shenyang, Liaoning, Peoples R China

[7] China Med Univ, Affiliated Hosp 1, NHC Key Lab AIDS Immunol, Shenyang, Liaoning, Peoples R China

Source:

JOURNAL OF TRANSLATIONAL MEDICINE | 2023 / Volume 21 / Issue 01

Keywords:

Machine learning; Carotid atherosclerosis; Explainable model; CHINESE ADULTS; RISK-FACTORS; PREVALENCE; ULTRASOUND; BURDEN; AGE; GENDER;

DOI:

10.1186/s12967-023-04093-8

Chinese Library Classification:

R-3 [Medical research methods]; R3 [Basic medicine];

Subject classification code:

1001 ;

Abstract:

Background: Carotid atherosclerosis (CAS), an important factor in the development of stroke, is a major public health concern. The aim of this study was to establish and validate machine learning (ML) models for early screening of CAS using routine health check-up indicators in northeast China.

Methods: A total of 69,601 health check-up records from the health examination center of the First Hospital of China Medical University (Shenyang, China) were collected between 2018 and 2019. For the 2019 records, 80% were assigned to the training set and 20% to the testing set. The 2018 records were used as the external validation dataset. Ten ML algorithms, including decision tree (DT), K-nearest neighbors (KNN), logistic regression (LR), naive Bayes (NB), random forest (RF), multilayer perceptron (MLP), extreme gradient boosting machine (XGB), gradient boosting decision tree (GBDT), linear support vector machine (SVM-linear), and non-linear support vector machine (SVM-nonlinear), were used to construct CAS screening models. The area under the receiver operating characteristic curve (auROC) and the area under the precision-recall curve (auPR) were used as measures of model performance. The SHapley Additive exPlanations (SHAP) method was used to demonstrate the interpretability of the optimal model.

Results: A total of 6315 records of patients undergoing carotid ultrasonography were collected; of these, 1632, 407, and 1141 patients were diagnosed with CAS in the training, internal validation, and external validation datasets, respectively. The GBDT model achieved the highest performance metrics, with an auROC of 0.860 (95% CI 0.839-0.880) in the internal validation dataset and 0.851 (95% CI 0.837-0.863) in the external validation dataset. Individuals with diabetes or those over 65 years of age showed low negative predictive value. In the interpretability analysis, age was the most important factor influencing the performance of the GBDT model, followed by sex and non-high-density lipoprotein cholesterol.

Conclusions: The ML models developed could provide good performance for CAS identification using routine health check-up indicators and could hopefully be applied in scenarios without ethnic and geographic heterogeneity for CAS prevention.
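The evaluation quantities named in the abstract (an 80%/20% split, auROC, auPR, and negative predictive value) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the fixed random seed, the toy labels and scores, and the 0.65 decision threshold are all assumptions made for illustration, and average precision is used here as one common estimator of auPR.

```python
import random


def split_80_20(records, seed=0):
    """Shuffle a copy of the records and split it 80% / 20% (seed is an assumption)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def au_roc(labels, scores):
    """auROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision at the rank of each positive.

    Used here as an estimator of auPR; tied scores are not treated specially.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def negative_predictive_value(labels, scores, threshold):
    """NPV = TN / (TN + FN) among records scored below the threshold."""
    predicted_negative = [y for y, s in zip(labels, scores) if s < threshold]
    true_negatives = sum(1 for y in predicted_negative if y == 0)
    return true_negatives / len(predicted_negative)


# Toy data (hypothetical): 1 = CAS on ultrasound, 0 = no CAS.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

train, test = split_80_20(range(10))
roc = au_roc(toy_labels, toy_scores)
ap = average_precision(toy_labels, toy_scores)
npv = negative_predictive_value(toy_labels, toy_scores, threshold=0.65)
print(len(train), len(test), round(roc, 3), round(ap, 3), round(npv, 3))
```

A low NPV in a subgroup, as reported for individuals with diabetes or over 65 years of age, means that a larger share of the records the model scores as negative in that subgroup are in fact CAS cases.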

Pages: 13
