Multivariable prediction model of complications derived from diabetes mellitus using machine learning on scarce highly unbalanced data

Authors
Colmenares-Mejia, Claudia C. [1]
Rincon-Acuna, Juan C. [2, 3]
Cely, Andres [1, 4]
Gonzalez-Velez, Abel E. [5]
Castillo, Andrea [6]
Murcia, Jossie [7]
Isaza-Ruget, Mario A. [8]
Affiliations
[1] Fdn Univ Sanitas, Bogota, DC, Colombia
[2] Univ Santander, Campus Lagos del Cacique, Bucaramanga, Santander, Colombia
[3] Keralty, Corp Data Management, Bogota, DC, Colombia
[4] Univ Nacl Colombia, Bogota, DC, Colombia
[5] Univ Hosp Torrejon, Prevent Med Serv, Torrejon De Ardoz, Spain
[6] EPS Sanitas, Direcc Gest Conocimiento, Bogota, DC, Colombia
[7] Fdn Univ Sanitas, Inst Gerencia & Gest Sanitaria, Bogota, DC, Colombia
[8] Fdn Univ Sanitas, Res Grp INPAC, Bogota, DC, Colombia
Keywords
Complications; Diabetes mellitus; Machine learning; Predictive analytics; Risk predictions;
DOI
10.1007/s13410-023-01264-7
Chinese Library Classification number
R5 [Internal Medicine];
Subject classification code
1002; 100201;
Abstract
Background: Diabetes mellitus (DM) increases the risk of complications in addition to mortality. Quantifying the risk of complications using artificial intelligence could be a way to design comprehensive patient healthcare programs.
Objective: To predict the probability of macrovascular and microvascular complications in patients with DM through machine learning (ML).
Methods: Retrospective cohort study. Based on an outpatient follow-up program for diabetic patients, 64,081 records and 287 variables were identified, with highly unbalanced data. Predictive models for chronic kidney disease (CKD), lower extremity amputation (LEA), coronary heart disease (CHD), and early mortality (MOR) were developed. An exhaustive computational search was conducted to find the best combination of ML algorithm and sampling method.
Results: The best model was determined by assessing its performance through heuristics obtained from a comprehensive analysis of the accuracy and F1 values for each ML algorithm, sampling method, and dataset. Regarding each complication, accuracy was 99.9% for LEA, 98.8% for CKD, 97.4% for MOR, and 94.3% for CHD. F1, assessed to account for false positives, was 84.5% for CKD, 63.6% for MOR, 46.2% for LEA, and 44.8% for CHD.
Conclusions: This ML model can be applied to predict CHD, CKD, and MOR. The success of ML predictions lies in the clinical definition of the initial variables and their simplification into variables from which the algorithms can identify patients who are likely to develop a complication. For clinical application of this system, it is necessary to assess the metrics jointly, as found here (accuracy higher than 95% and F1-score higher than 80%).
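The abstract reports accuracy and F1 together because, on highly unbalanced data, accuracy alone can look excellent while few true cases are caught. The sketch below illustrates that effect and a joint check against the two thresholds quoted in the conclusions. It is not the authors' code: the function names and the confusion-matrix counts are hypothetical, chosen only for illustration.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for a binary confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


def meets_thresholds(accuracy, f1, min_accuracy=0.95, min_f1=0.80):
    """Joint check using the thresholds quoted in the abstract's conclusions."""
    return accuracy > min_accuracy and f1 > min_f1


# Hypothetical counts (an assumption, not data from the study):
# 10,000 patients, of whom only 50 develop the complication.
acc, f1 = accuracy_and_f1(tp=20, fp=15, fn=30, tn=9935)
print(f"accuracy={acc:.4f}, F1={f1:.4f}, acceptable={meets_thresholds(acc, f1)}")
```

With these illustrative counts the classifier is right on more than 99% of patients yet finds fewer than half of the true cases, so it fails the joint check even though its accuracy is far above 95%.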
Pages: 528-538
Number of pages: 11