Machine learning approach for the detection of vitamin D level: a comparative study

Cited by: 5
Authors
Sancar, Nuriye [1]
Tabrizi, Sahar S. [2]
Affiliations
[1] Near East Univ, Dept Math, TR-99138 Nicosia, Turkiye
[2] Univ Tabriz, Fac Elect & Comp Engn, Dept Comp Engn, Tabriz, Iran
Keywords
Machine learning models; Vitamin D; Metabolic syndrome; Comparative study; Multicollinearity; Classification algorithm; ORDINAL LOGISTIC-REGRESSION; D DEFICIENCY; METABOLIC SYNDROME; PREDICTION; CLASSIFICATION; REGULARIZATION; DIAGNOSIS; MODELS;
DOI
10.1186/s12911-023-02323-z
Chinese Library Classification (CLC) number
R-058
Discipline classification code
Abstract
Background: After the World Health Organization declared the COVID-19 pandemic, the role of vitamin D became even more critical for people worldwide. The most accurate way to determine vitamin D level is the 25-hydroxy vitamin D (25-OH-D) blood test. However, this blood test is not always feasible. Data sets used in health science research often contain highly correlated features, which is referred to as the multicollinearity problem. This problem can lead to misleading results and overfitting in the ML training process. Therefore, this study aims to determine a clinically acceptable ML model that accurately detects the vitamin D status of adult participants in North Cyprus without the need to measure the 25-OH-D level, while taking the multicollinearity problem into account.
Method: The study was conducted with 481 observations from individuals who applied voluntarily to the Internal Medicine Department at NEU Hospital. The classification performance of four conventional supervised ML models was compared: ordinal logistic regression (OLR), elastic-net ordinal regression (ENOR), support vector machine (SVM), and random forest (RF). The comparative analysis covers the models' sensitivity to participants' metabolic syndrome (MtS)-positive status, hyper-parameter tuning, sensitivity to the size of the training data, and classification performance.
Results: The findings showed that, owing to the presence of multicollinearity, the performance of the SVM (RBF) was clearly degraded when the test results were examined. Moreover, RF was more robust than the other models when the size of the training data was varied. In this experiment, the selected RF and ENOR models performed better than the other two models when the number of training samples was reduced. Since multicollinearity is more severe in small samples, it can be concluded that RF and ENOR are not affected by the presence of the multicollinearity problem. The comparative analysis revealed that the RF classifier performed better and was more robust than the other models in terms of accuracy (0.94), specificity (0.96), sensitivity or recall (0.94), precision (0.95), F1-score (0.95), and Cohen's kappa (0.90).
Conclusion: The RF performed better than the SVM (RBF), ENOR, and OLR. These comparison findings will be applied to develop an intelligent vitamin D level detection system that uses routine clinical and biochemical tests and lifestyle characteristics of individuals, in order to decrease the cost and time of vitamin D level detection.
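The evaluation measures named in the Results (accuracy, specificity, sensitivity or recall, precision, F1-score, and Cohen's kappa) can all be derived from a confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' code: the function name `classification_metrics`, the macro-averaging over classes, and the example matrix are assumptions made for illustration, and the abstract does not state how per-class values were averaged.

```python
def classification_metrics(cm):
    """Compute summary metrics from a square confusion matrix.

    cm[i][j] = number of samples whose true class is i and predicted class is j.
    Per-class values are macro-averaged (an assumption; the abstract does not
    say which averaging scheme was used).
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    true_counts = [sum(row) for row in cm]                           # row sums
    pred_counts = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # column sums

    accuracy = sum(cm[c][c] for c in range(k)) / total

    recalls, precisions, specificities, f1s = [], [], [], []
    for c in range(k):
        tp = cm[c][c]
        fn = true_counts[c] - tp
        fp = pred_counts[c] - tp
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        recalls.append(recall)
        precisions.append(precision)
        specificities.append(specificity)
        f1s.append(f1)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(true_counts[c] * pred_counts[c] for c in range(k)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sum(recalls) / k,
        "specificity": sum(specificities) / k,
        "precision": sum(precisions) / k,
        "f1": sum(f1s) / k,
        "kappa": kappa,
    }


# Hypothetical 3-class confusion matrix (made-up counts, not data from the study).
example_cm = [
    [30, 2, 0],
    [3, 25, 2],
    [0, 1, 33],
]
for name, value in classification_metrics(example_cm).items():
    print(f"{name}: {value:.3f}")
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when the vitamin D status classes are unevenly represented.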
Pages: 19
Related papers
50 in total
  • [1] Machine learning approach for the detection of vitamin D level: a comparative study
    Nuriye Sancar
    Sahar S. Tabrizi
    BMC Medical Informatics and Decision Making, 23
  • [2] Exploring a Novel Machine Learning Approach for Evaluating Parkinson's Disease, Duration, and Vitamin D Level
    Ali, Md. Asraf
    Morol, Md. Kishor
    Mridha, Muhammad F.
    Fahad, Nafiz
    Huda, Md Sadi Al
    Ahmed, Nasim
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (12) : 625 - 632
  • [3] DECISION TREE-BASED CLASSIFICATION APPROACH TO DISCOVER FACTORS AFFECTING VITAMIN D LEVEL WITH MACHINE LEARNING
    Unal, Ceyda
    Cilgin, Cihan
    Albas, Suleyman
    Koc, Esra Meltem
    JOURNAL OF BASIC AND CLINICAL HEALTH SCIENCES, 2024, 8 (02) : 336 - 348
  • [4] Machine learning approaches to constructing predictive models of vitamin D deficiency in a hypertensive population: a comparative study
    Garcia Carretero, Rafael
    Vigil-Medina, Luis
    Barquero-Perez, Oscar
    Mora-Jimenez, Inmaculada
    Soguero-Ruiz, Cristina
    Ramos-Lopez, Javier
    INFORMATICS FOR HEALTH & SOCIAL CARE, 2021, 46 (04) : 355 - 369
  • [5] Comparative approach on crop detection using machine learning and deep learning techniques
    Nithya, V.
    Josephine, M. S.
    Jeyabalaraja, V.
    INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2024, 15 (09) : 4636 - 4648
  • [6] An Integrated Machine Learning Framework for Fraud Detection: A Comparative and Comprehensive Approach
    Ouazzane, Karim
    Polykarpou, Thekla
    Patel, Yogesh
    Li, Jun
    INTERNATIONAL JOURNAL OF INFORMATION SECURITY AND PRIVACY, 2022, 16 (01)
  • [7] Fall Detection and Monitoring using Machine Learning: A Comparative Study
    Edeib, Shaima R. M.
    Dziyauddin, Rudzidatul Akmam
    Amir, Nur Izdihar Muhd
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (02) : 723 - 728
  • [8] Comparative Study of Machine Learning Algorithm for Intrusion Detection System
    Sravani, K.
    Srinivasu, P.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON FRONTIERS OF INTELLIGENT COMPUTING: THEORY AND APPLICATIONS (FICTA) 2013, 2014, 247 : 189 - 196
  • [9] Comparative Study of Machine Learning Algorithms for SMS Spam Detection
    Alzahrani, Amani
    Rawat, Danda B.
    2019 IEEE SOUTHEASTCON, 2019,
  • [10] Comparative study of supervised machine learning techniques for intrusion detection
    Gharibian, Farnaz
    Ghorbani, Ali A.
    CNSR 2007: PROCEEDINGS OF THE FIFTH ANNUAL CONFERENCE ON COMMUNICATION NETWORKS AND SERVICES RESEARCH, 2007, : 350 - +