Machine learning approach for the detection of vitamin D level: a comparative study

Cited by: 5
Authors
Sancar, Nuriye [1]
Tabrizi, Sahar S. [2]
Affiliations
[1] Near East Univ, Dept Math, TR-99138 Nicosia, Turkiye
[2] Univ Tabriz, Fac Elect & Comp Engn, Dept Comp Engn, Tabriz, Iran
Keywords
Machine learning models; Vitamin D; Metabolic syndrome; Comparative study; Multicollinearity; Classification algorithm; Ordinal logistic regression; D deficiency; Prediction; Classification; Regularization; Diagnosis; Models
DOI
10.1186/s12911-023-02323-z
Chinese Library Classification number
R-058
Abstract
Background: After the World Health Organization declared the COVID-19 pandemic, the role of vitamin D became even more critical for people worldwide. The most accurate way to determine vitamin D level is the 25-hydroxy vitamin D (25-OH-D) blood test. However, this blood test is not always feasible. Data sets used in health science research usually contain highly correlated features, a condition referred to as the multicollinearity problem. This problem can lead to misleading results and overfitting in the ML training process. Therefore, the study aims to determine a clinically acceptable ML model that accurately detects the vitamin D status of adult participants in North Cyprus without the need to measure the 25-OH-D level, taking the multicollinearity problem into account.

Method: The study was conducted with 481 participants who applied voluntarily to the Internal Medicine Department at NEU Hospital. The classification performance of four conventional supervised ML models was compared: ordinal logistic regression (OLR), elastic-net ordinal regression (ENOR), support vector machine (SVM), and random forest (RF). The comparative analysis covers the models' sensitivity to participants' metabolic syndrome (MtS)-positive status, hyper-parameter tuning, sensitivity to the size of the training data, and the classification performance of the models.

Results: The findings showed that, due to the presence of multicollinearity, the performance of SVM (RBF) was clearly degraded when the test results were examined. Moreover, RF was more robust than the other models when the size of the training data was varied. In this experiment, the selected RF and ENOR models performed better than the other two models when the number of training samples was reduced. Since multicollinearity is more severe in small samples, it can be concluded that RF and ENOR are not affected by the presence of the multicollinearity problem. The comparative analysis revealed that the RF classifier performed better and was more robust than the other models in terms of accuracy (0.94), specificity (0.96), sensitivity or recall (0.94), precision (0.95), F1-score (0.95), and Cohen's kappa (0.90).

Conclusion: RF performed better than SVM (RBF), ENOR, and OLR. These comparison findings will be applied to develop an intelligent vitamin D level detection system that uses routine clinical and biochemical tests and the lifestyle characteristics of individuals, in order to decrease the cost and time of vitamin D level detection.
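The evaluation metrics named in the abstract (accuracy, specificity, sensitivity/recall, precision, F1-score, and Cohen's kappa) can all be derived from true and predicted class labels. The following is a minimal Python sketch, not the authors' code: the function name `summarize_predictions`, the three class labels, the toy data, and the choice of macro-averaging over classes are all assumptions made for illustration; the abstract does not state how the per-class values were aggregated.

```python
from collections import Counter


def summarize_predictions(y_true, y_pred):
    """Return macro-averaged metrics and Cohen's kappa for a multi-class task.

    Hypothetical helper for illustration; macro-averaging is an assumption.
    """
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    pairs = Counter(zip(y_true, y_pred))   # (true, predicted) -> count
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)

    def safe_div(a, b):
        # Avoid ZeroDivisionError when a class is never true or never predicted.
        return a / b if b else 0.0

    recalls, precisions, specificities, f1s = [], [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        fn = true_counts[c] - tp
        fp = pred_counts[c] - tp
        tn = n - tp - fn - fp
        recall = safe_div(tp, tp + fn)
        precision = safe_div(tp, tp + fp)
        recalls.append(recall)
        precisions.append(precision)
        specificities.append(safe_div(tn, tn + fp))
        f1s.append(safe_div(2 * precision * recall, precision + recall))

    accuracy = sum(pairs[(c, c)] for c in labels) / n
    # Agreement expected by chance, from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    kappa = safe_div(accuracy - p_e, 1 - p_e)

    k = len(labels)
    return {
        "accuracy": accuracy,
        "specificity": sum(specificities) / k,
        "recall": sum(recalls) / k,
        "precision": sum(precisions) / k,
        "f1": sum(f1s) / k,
        "kappa": kappa,
    }


# Toy labels (hypothetical, not data from the study).
y_true = ["deficient", "deficient", "insufficient",
          "insufficient", "sufficient", "sufficient"]
y_pred = ["deficient", "insufficient", "insufficient",
          "insufficient", "sufficient", "sufficient"]

report = summarize_predictions(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

Cohen's kappa is included alongside accuracy because it corrects for the agreement expected by chance, which matters when the classes are imbalanced.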
Pages: 19