Logistic regression was as good as machine learning for predicting major chronic diseases

Cited by: 259
Authors
Nusinovici, Simon [1]
Tham, Yih Chung [1, 3]
Yan, Marco Yu Chak [1]
Ting, Daniel Shu Wei [1, 3]
Li, Jialiang [1, 4]
Sabanayagam, Charumathi [1, 3]
Wong, Tien Yin [1, 2, 3]
Cheng, Ching-Yu [1, 2, 3]
Affiliations
[1] Singapore Natl Eye Ctr, Singapore Eye Res Inst, Singapore, Singapore
[2] Natl Univ Singapore, Yong Loo Lin Sch Med, Dept Ophthalmol, Singapore, Singapore
[3] Duke NUS Med Sch, Ophthalmol & Visual Sci Acad Clin Programme, Singapore, Singapore
[4] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore, Singapore
Funding
Medical Research Council (UK);
Keywords
Machine learning; Logistic regression; Prognostic modeling; Chronic diseases; Interaction; Nonlinearity; SINGAPORE MALAY EYE; CONVENTIONAL REGRESSION; CARDIOVASCULAR-DISEASE; RISK PREDICTION; METHODOLOGY; CLASSIFICATION; RATIONALE; PROGNOSIS; MORTALITY; DIAGNOSIS;
DOI
10.1016/j.jclinepi.2020.03.002
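Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Discipline classification code
Abstract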
Objective: To evaluate the performance of machine learning (ML) algorithms and compare them with logistic regression for predicting the risk of cardiovascular diseases (CVDs), chronic kidney disease (CKD), diabetes (DM), and hypertension (HTN) in a prospective cohort study using simple clinical predictors. Study Design and Setting: We conducted analyses in a population-based cohort study of Asian adults (n = 6,762). Five ML models were considered (single-hidden-layer neural network, support vector machine, random forest, gradient boosting machine, and k-nearest neighbor) and compared with standard logistic regression. Results: The 6-year incidences of CVD, CKD, DM, and HTN were 4.0%, 7.0%, 9.2%, and 34.6%, respectively. Logistic regression reached the highest area under the receiver operating characteristic curve for CKD (0.905 [0.88, 0.93]) and DM (0.768 [0.73, 0.81]) predictions. For CVD and HTN, the best models were neural network (0.753 [0.70, 0.81]) and support vector machine (0.780 [0.747, 0.812]), respectively. However, the differences from logistic regression were small (less than 1%) and nonsignificant. Logistic regression, gradient boosting machine, and neural network were systematically ranked among the best models. Conclusion: Logistic regression performs as well as ML models in predicting the risk of major chronic diseases with low incidence and simple clinical predictors. © 2020 Elsevier Inc. All rights reserved.
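The abstract compares models by their area under the receiver operating characteristic curve (AUC). As a minimal sketch of how that metric can be computed and compared between two models, the Python below uses the rank-based definition of AUC: the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. The function name `auc`, the labels, and both sets of scores are hypothetical illustrations and are not taken from the study, which does not describe its computation here.

```python
def auc(labels, scores):
    """Rank-based AUC: share of (case, non-case) pairs in which the case
    has the higher score. Tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = developed the disease, 0 = did not.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
# Hypothetical predicted risks from two competing models.
model_a = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.90]
model_b = [0.10, 0.05, 0.25, 0.45, 0.30, 0.35, 0.80, 0.60]

auc_a = auc(labels, model_a)
auc_b = auc(labels, model_b)
print(f"AUC model A: {auc_a:.3f}")
print(f"AUC model B: {auc_b:.3f}")
print(f"Difference:  {auc_a - auc_b:.3f}")
```

The pairwise loop is quadratic in the number of subjects, which is acceptable for illustration. A cohort-sized analysis would more likely use a sorted-rank implementation or a library routine, together with confidence intervals such as those quoted in the abstract.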
Pages: 56-69
Number of pages: 14