Comparison of Statistical Logistic Regression and RandomForest Machine Learning Techniques in Predicting Diabetes

Citations: 35
Authors
Daghistani, Tahani [1]
Alshammari, Riyad [1]
Affiliations
[1] King Saud Bin Abdulaziz Univ Hlth Sci KSAU HS, King Abdullah Int Med Res Ctr KAIMRC, Coll Publ Hlth & Hlth Informat, Hlth Informat Dept,Minist Natl Guard Hlth Affairs, Riyadh, Saudi Arabia
Keywords
diabetes; predictive model; machine learning; RandomForest; logistic regression;
DOI
10.12720/jait.11.2.78-83
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Diabetes is a global healthcare concern and one of the leading health challenges in Saudi Arabia. Its prevalence is anticipated to rise, and early prediction of individuals at high risk of diabetes remains a significant challenge. This study compares the RandomForest machine learning algorithm with the Logistic Regression algorithm for predicting diabetes. We analyzed 66,325 records extracted from the Ministry of National Guard Health Affairs (MNGHA) databases in Saudi Arabia between 2013 and 2015. Both algorithms were applied to predict diabetes based on 18 risk factors. The two algorithms were compared on precision, recall, true positive rate, false positive rate, F-measure, and area under the ROC curve (AUC). The overall prevalence of diabetes in the data set was 64.47%. Males represented 55.50% of the data set and females 44.50%. For the RandomForest (RF) model, the precision, recall, true positive rate, false positive rate, and F-measure for predicting diabetes were 0.883, 0.88, 0.88, 0.188, and 0.876, respectively, while for the Logistic Regression model they were 0.692, 0.703, 0.703, 0.454, and 0.675, respectively. The AUC was 0.944 for the RF model and 0.708 for the Logistic Regression model, indicating higher predictive performance for RF. Overall, the RF algorithm outperformed Logistic Regression in predicting diabetes across the evaluation metrics.
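As a minimal sketch of how the evaluation criteria named in the abstract relate to one another, the Python snippet below derives precision, recall (which is the same quantity as the true positive rate), false positive rate, and F-measure from the four cells of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical; they are not taken from the study, and the study's own tooling and averaging scheme are not stated in the abstract.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common binary classification metrics from confusion-matrix counts.

    tp/fn: diabetic cases predicted as diabetic / non-diabetic.
    fp/tn: non-diabetic cases predicted as diabetic / non-diabetic.
    """
    precision = tp / (tp + fp)            # share of positive predictions that are correct
    recall = tp / (tp + fn)               # true positive rate (sensitivity)
    false_positive_rate = fp / (fp + tn)  # share of negatives wrongly flagged
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "precision": precision,
        "recall": recall,
        "true_positive_rate": recall,
        "false_positive_rate": false_positive_rate,
        "f_measure": f_measure,
    }


# Hypothetical counts chosen only for illustration.
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because recall and true positive rate share one definition, they always take the same value, which matches the identical figures the abstract reports for each model (0.88 and 0.703).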
Pages: 78-83
Page count: 6