Development and validation of explainable machine-learning models for carotid atherosclerosis early screening

Cited by: 5
Authors
Yun, Ke [1, 2]
He, Tao [3]
Zhen, Shi [4]
Quan, Meihui [1, 2]
Yang, Xiaotao [1, 2]
Man, Dongliang [1, 2]
Zhang, Shuang [1, 2]
Wang, Wei [5]
Han, Xiaoxu [1, 2, 6, 7]
Affiliations
[1] China Med Univ, Affiliated Hosp 1, Natl Clin Res Ctr Lab Med, Shenyang, Liaoning, Peoples R China
[2] China Med Univ, Affiliated Hosp 1, Dept Lab Med, Shenyang, Liaoning, Peoples R China
[3] Neusoft Corp, Neusoft Res Inst, Shenyang, Liaoning, Peoples R China
[4] Northeastern Univ, Dept Software Engn, Shenyang, Liaoning, Peoples R China
[5] China Med Univ, Affiliated Hosp 1, Dept Phys Examinat Ctr, Shenyang, Liaoning, Peoples R China
[6] Chinese Acad Med Sci, Lab Med Innovat Unit, Shenyang, Liaoning, Peoples R China
[7] China Med Univ, Affiliated Hosp 1, NHC Key Lab AIDS Immunol, Shenyang, Liaoning, Peoples R China
Keywords
Machine learning; Carotid atherosclerosis; Explainable model; CHINESE ADULTS; RISK-FACTORS; PREVALENCE; ULTRASOUND; BURDEN; AGE; GENDER
DOI
10.1186/s12967-023-04093-8
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine]
Discipline classification code
1001
Abstract
Background: Carotid atherosclerosis (CAS), an important factor in the development of stroke, is a major public health concern. The aim of this study was to establish and validate machine learning (ML) models for early screening of CAS using routine health check-up indicators in northeast China.
Methods: A total of 69,601 health check-up records from the health examination center of the First Hospital of China Medical University (Shenyang, China) were collected between 2018 and 2019. For the 2019 records, 80% were assigned to the training set and 20% to the testing set. The 2018 records were used as the external validation dataset. Ten ML algorithms, including decision tree (DT), K-nearest neighbors (KNN), logistic regression (LR), naive Bayes (NB), random forest (RF), multilayer perceptron (MLP), extreme gradient boosting machine (XGB), gradient boosting decision tree (GBDT), linear support vector machine (SVM-linear), and non-linear support vector machine (SVM-nonlinear), were used to construct CAS screening models. The area under the receiver operating characteristic curve (auROC) and the area under the precision-recall curve (auPR) were used as measures of model performance. The SHapley Additive exPlanations (SHAP) method was used to demonstrate the interpretability of the optimal model.
Results: A total of 6315 records of patients undergoing carotid ultrasonography were collected; of these, 1632, 407, and 1141 patients were diagnosed with CAS in the training, internal validation, and external validation datasets, respectively. The GBDT model achieved the highest performance metrics, with an auROC of 0.860 (95% CI 0.839-0.880) in the internal validation dataset and 0.851 (95% CI 0.837-0.863) in the external validation dataset. Individuals with diabetes or those over 65 years of age showed low negative predictive value. In the interpretability analysis, age was the most important factor influencing the performance of the GBDT model, followed by sex and non-high-density lipoprotein cholesterol.
Conclusions: The ML models developed could provide good performance for CAS identification using routine health check-up indicators and could hopefully be applied in scenarios without ethnic and geographic heterogeneity for CAS prevention.
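The abstract reports model quality as auROC and discusses negative predictive value (NPV) in subgroups. As a rough illustration of what those two quantities measure, the following is a minimal sketch in plain Python. It is not the authors' code: the function names, the 0.5 threshold, and the toy labels and scores are all assumptions made up for this example.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive record scores
    higher than a randomly chosen negative one; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def negative_predictive_value(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> float:
    """Share of records predicted negative (score < threshold) that are truly negative."""
    predicted_negative = [y for y, s in zip(labels, scores) if s < threshold]
    if not predicted_negative:
        raise ValueError("no records fall below the threshold")
    return predicted_negative.count(0) / len(predicted_negative)


# Toy data invented for illustration only (not taken from the study).
# Label 1 = CAS present, 0 = CAS absent; scores are hypothetical model outputs.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.20, 0.90]

toy_auroc = auroc(toy_labels, toy_scores)
toy_npv = negative_predictive_value(toy_labels, toy_scores, threshold=0.5)
print(f"auROC = {toy_auroc:.4f}, NPV at threshold 0.5 = {toy_npv:.2f}")
```

The pairwise form is quadratic in the number of records, which is fine for a toy example; on datasets of the size described in the abstract, a rank-based implementation from an established library would be the more practical choice.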
Pages: 13