Classification based on event in survival machine learning analysis of cardiovascular disease cohort

Cited by: 2
Authors
Ahmad, Shokh Mukhtar [1, 2]
Ahmed, Nawzad Muhammed [1 ]
Affiliations
[1] Sulaymaniyah Univ, Coll Adm & Econ, Dept Stat & Informat, Kurdistan, Sulaymaniyah, Iraq
[2] Komar Univ Sci & Technol Sci, Dept Med Lab, Kurdistan, Sulaymaniyah, Iraq
Keywords
Survival analysis; Machine learning; Logistic regression; SVM; Tree descent; Random forest; Variable selection
DOI
10.1186/s12872-023-03328-2
Chinese Library Classification (CLC) number
R5 [Internal medicine]
Discipline classification code
1002; 100201
Abstract
This study assesses how well supervised classification models predict patient outcomes in a survival analysis problem involving cardiovascular patients with a significant cured fraction. The sample comprised 919 patients (365 females and 554 males) who were referred to Sulaymaniyah Cardiac Hospital and followed up for a maximum of 650 days between 2021 and 2023. During the study period, 162 patients (17.6%) died, and the presence of a cure fraction in this cohort was confirmed using the Mahler and Zhu test (P < 0.01). To identify the best procedure for predicting patient status, several machine learning classifiers were used to classify patients as alive or dead, and they gave broadly similar results across several indicators. Random forest was the best method on most indicators, with an area under the ROC curve of 0.934. Its only weakness was relatively poor performance in correctly identifying deceased patients; SVM, with an FP rate of 0.263, performed better in this regard. Logistic and simple regression also outperformed the remaining methods, with areas under the ROC curve of 0.911 and 0.909, respectively.
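The two indicators quoted in the abstract, the FP rate and the area under the ROC curve, can be illustrated with a minimal Python sketch. This is not the authors' code: the labels and scores below are toy values, the function names are hypothetical, and coding death as the positive class (label 1) is an assumption, since the abstract does not state which class is treated as positive.

```python
def false_positive_rate(y_true, y_pred):
    """FP / (FP + TN), treating label 1 as the positive class (assumption)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn)


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data only (not from the study): 1 = died, 0 = alive.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.5, 0.2, 0.2, 0.1]  # predicted risk of death
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # 0.5 cut-off is arbitrary

print("FP rate:", false_positive_rate(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

The FP rate depends on the chosen cut-off, whereas the AUC summarizes ranking quality over all cut-offs, which is one reason a model can lead on AUC and still trail on a single-threshold indicator.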
Pages: 7