Can machine-learning improve cardiovascular risk prediction using routine clinical data?

Cited by: 691
Authors
Weng, Stephen F. [1, 2]
Reps, Jenna [3, 4]
Kai, Joe [1, 2]
Garibaldi, Jonathan M. [3, 4]
Qureshi, Nadeem [1, 2]
Affiliations
[1] Univ Nottingham, NIHR Sch Primary Care Res, Nottingham, England
[2] Univ Nottingham, Sch Med, Div Primary Care, Nottingham, England
[3] Univ Nottingham, Adv Data Anal Ctr, Nottingham, England
[4] Univ Nottingham, Sch Comp Sci, Nottingham, England
Source
PLOS ONE | 2017 / Vol. 12 / Issue 04
Keywords
CORONARY EVENTS; VALIDATION; MODELS; REGRESSION; DISEASE; MUNSTER; PROFILE; WOMEN; MEN;
DOI
10.1371/journal.pone.0174944
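Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract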
Background: Current approaches to predict cardiovascular risk fail to identify many people who would benefit from preventive treatment, while others receive unnecessary intervention. Machine-learning offers opportunity to improve accuracy by exploiting complex interactions between risk factors. We assessed whether machine-learning can improve cardiovascular risk prediction.
Methods: Prospective cohort study using routine clinical data of 378,256 patients from UK family practices, free from cardiovascular disease at outset. Four machine-learning algorithms (random forest, logistic regression, gradient boosting machines, neural networks) were compared to an established algorithm (American College of Cardiology guidelines) to predict first cardiovascular event over 10 years. Predictive accuracy was assessed by area under the 'receiver operating curve' (AUC); and sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) to predict 7.5% cardiovascular risk (threshold for initiating statins).
Findings: 24,970 incident cardiovascular events (6.6%) occurred. Compared to the established risk prediction algorithm (AUC 0.728, 95% CI 0.723-0.735), machine-learning algorithms improved prediction: random forest +1.7% (AUC 0.745, 95% CI 0.739-0.750), logistic regression +3.2% (AUC 0.760, 95% CI 0.755-0.766), gradient boosting +3.3% (AUC 0.761, 95% CI 0.755-0.766), neural networks +3.6% (AUC 0.764, 95% CI 0.759-0.769). The highest achieving (neural networks) algorithm predicted 4,998/7,404 cases (sensitivity 67.5%, PPV 18.4%) and 53,458/75,585 non-cases (specificity 70.7%, NPV 95.7%), correctly predicting 355 (+7.6%) more patients who developed cardiovascular disease compared to the established algorithm.
Conclusions: Machine-learning significantly improves accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment, while avoiding unnecessary treatment of others.
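The sensitivity, specificity, PPV and NPV quoted for the neural network follow directly from the case and non-case counts given in the abstract. The sketch below re-derives them as a consistency check; the function name `confusion_metrics` and its parameter names are illustrative assumptions, not taken from the paper, and the false-positive and false-negative counts are inferred by subtraction rather than reported.

```python
def confusion_metrics(tp, positives, tn, negatives):
    """Derive screening metrics from correctly classified counts.

    tp / positives: correctly predicted cases out of all cases.
    tn / negatives: correctly predicted non-cases out of all non-cases.
    (Hypothetical helper; names are assumptions for illustration.)
    """
    fn = positives - tp   # cases the model missed (inferred)
    fp = negatives - tn   # non-cases flagged as high risk (inferred)
    return {
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts quoted in the abstract for the neural network at the 7.5% threshold.
metrics = confusion_metrics(tp=4998, positives=7404, tn=53458, negatives=75585)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Rounded to one decimal place, the four values agree with the percentages reported in the abstract.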
Pages: 14
Related papers
43 in total
  • [1] [Anonymous], CIRCULATION
  • [2] [Anonymous], PERS MED STRAT
  • [3] [Anonymous], 2015, The precision medicine initiative cohort program building a research foundation for 21st century medicine
  • [4] Simple scoring scheme for calculating the risk of acute coronary events based on the 10-year follow-up of the Prospective Cardiovascular Munster (PROCAM) study
    Assmann, G
    Cullen, P
    Schulte, H
    [J]. CIRCULATION, 2002, 105 (03) : 310 - 315
  • [5] Batista GEAPA, 2003, APPL ARTIF INTELL, V17, P519, DOI 10.1080/08839510390219309
  • [6] Bengio Yoshua, 2012, Neural Networks: Tricks of the Trade. Second Edition: LNCS 7700, P437, DOI 10.1007/978-3-642-35289-8_26
  • [7] Adherence to and beliefs in lipid-lowering medical treatments: A structural equation modeling approach including the necessity-concern framework
    Berglund, Erik
    Lytsy, Per
    Westerling, Ragnar
    [J]. PATIENT EDUCATION AND COUNSELING, 2013, 91 (01) : 105 - 112
  • [8] Representative and optimal use of body mass index (BMI) in the UK Clinical Practice Research Datalink (CPRD)
    Bhaskaran, Krishnan
    Forbes, Harriet J.
    Douglas, Ian
    Leon, David A.
    Smeeth, Liam
    [J]. BMJ OPEN, 2013, 3 (09):
  • [9] Breiman L., 2001, Machine Learning, V45, P5
  • [10] dRHP-PseRA: detecting remote homology proteins using profile-based pseudo protein sequence and rank aggregation
    Chen, Junjie
    Long, Ren
    Wang, Xiao-long
    Liu, Bin
    Chou, Kuo-Chen
    [J]. SCIENTIFIC REPORTS, 2016, 6