Development and validation of explainable machine-learning models for carotid atherosclerosis early screening

Cited by: 5
Authors
Yun, Ke [1, 2]
He, Tao [3]
Zhen, Shi [4]
Quan, Meihui [1, 2]
Yang, Xiaotao [1, 2]
Man, Dongliang [1, 2]
Zhang, Shuang [1, 2]
Wang, Wei [5]
Han, Xiaoxu [1, 2, 6, 7]
Affiliations
[1] China Med Univ, Affiliated Hosp 1, Natl Clin Res Ctr Lab Med, Shenyang, Liaoning, Peoples R China
[2] China Med Univ, Affiliated Hosp 1, Dept Lab Med, Shenyang, Liaoning, Peoples R China
[3] Neusoft Corp, Neusoft Res Inst, Shenyang, Liaoning, Peoples R China
[4] Northeastern Univ, Dept Software Engn, Shenyang, Liaoning, Peoples R China
[5] China Med Univ, Affiliated Hosp 1, Dept Phys Examinat Ctr, Shenyang, Liaoning, Peoples R China
[6] Chinese Acad Med Sci, Lab Med Innovat Unit, Shenyang, Liaoning, Peoples R China
[7] China Med Univ, Affiliated Hosp 1, NHC Key Lab AIDS Immunol, Shenyang, Liaoning, Peoples R China
Keywords
Machine learning; Carotid atherosclerosis; Explainable model; CHINESE ADULTS; RISK-FACTORS; PREVALENCE; ULTRASOUND; BURDEN; AGE; GENDER;
DOI
10.1186/s12967-023-04093-8
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine]
Discipline classification code
1001
Abstract
Background: Carotid atherosclerosis (CAS), an important factor in the development of stroke, is a major public health concern. The aim of this study was to establish and validate machine learning (ML) models for early screening of CAS using routine health check-up indicators in northeast China.

Methods: A total of 69,601 health check-up records from the health examination center of the First Hospital of China Medical University (Shenyang, China) were collected between 2018 and 2019. For the 2019 records, 80% were assigned to the training set and 20% to the testing set. The 2018 records were used as the external validation dataset. Ten ML algorithms were used to construct CAS screening models: decision tree (DT), K-nearest neighbors (KNN), logistic regression (LR), naive Bayes (NB), random forest (RF), multilayer perceptron (MLP), extreme gradient boosting machine (XGB), gradient boosting decision tree (GBDT), linear support vector machine (SVM-linear), and non-linear support vector machine (SVM-nonlinear). The area under the receiver operating characteristic curve (auROC) and the area under the precision-recall curve (auPR) were used as measures of model performance. The SHapley Additive exPlanations (SHAP) method was used to demonstrate the interpretability of the optimal model.

Results: A total of 6315 records of patients undergoing carotid ultrasonography were collected; of these, 1632, 407, and 1141 patients were diagnosed with CAS in the training, internal validation, and external validation datasets, respectively. The GBDT model achieved the highest performance metrics, with an auROC of 0.860 (95% CI 0.839-0.880) in the internal validation dataset and 0.851 (95% CI 0.837-0.863) in the external validation dataset. Individuals with diabetes or those over 65 years of age showed low negative predictive value. In the interpretability analysis, age was the most important factor influencing the performance of the GBDT model, followed by sex and non-high-density lipoprotein cholesterol.

Conclusions: The ML models developed could provide good performance for CAS identification using routine health check-up indicators and could potentially be applied in scenarios without ethnic and geographic heterogeneity for CAS prevention.
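
The abstract reports auROC with 95% confidence intervals and a negative predictive value (NPV) for subgroups, but it does not say how either was computed. The following is a minimal pure-Python sketch of one common way to obtain such figures. The percentile bootstrap for the interval, the 0.5 decision threshold, and all function names are assumptions made for illustration, not the authors' procedure.

```python
import random


def auroc(labels, scores):
    """auROC via the rank-sum (Mann-Whitney U) identity; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("auROC needs at least one positive and one negative label")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for auROC (assumed method, not taken from the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def negative_predictive_value(labels, scores, threshold=0.5):
    """NPV = TN / (TN + FN), treating scores below the (assumed) threshold as negative calls."""
    tn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    if tn + fn == 0:
        raise ValueError("no negative predictions at this threshold")
    return tn / (tn + fn)


if __name__ == "__main__":
    # Toy data for illustration only; these are not values from the study.
    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]
    print("auROC:", auroc(y_true, y_score))
    print("NPV at 0.5:", negative_predictive_value(y_true, y_score))
    print("95% CI:", bootstrap_ci(y_true, y_score, n_boot=200))
```

Applied to a subgroup (for example, records restricted to people over 65), the same NPV function would show how often a negative screening result is correct in that subgroup, which is the quantity the Results paragraph describes as low.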
Pages: 13