Predicting Future Cardiovascular Events in Patients With Peripheral Artery Disease Using Electronic Health Record Data

Cited by: 49
Authors
Ross, Elsie Gyang [1, 2]
Jung, Kenneth [2 ]
Dudley, Joel T. [3 ]
Li, Li [3, 4]
Leeper, Nicholas J. [1 ]
Shah, Nigam H. [2 ]
Affiliations
[1] Stanford Univ, Sch Med, Div Vasc Surg, Stanford, CA 94305 USA
[2] Stanford Univ, Sch Med, Ctr Biomed Informat Res, Stanford, CA 94305 USA
[3] Icahn Sch Med Mt Sinai, New York, NY 10029 USA
[4] Sema4, Stamford, CT USA
Source
CIRCULATION-CARDIOVASCULAR QUALITY AND OUTCOMES | 2019 / Vol. 12 / Issue 03
Keywords
electronic health records; machine learning; mortality; peripheral arterial disease; risk; COMMON DATA MODELS; RISK PREDICTION; MORTALITY; OUTCOMES; IDENTIFICATION; PERFORMANCE; PREVALENCE; MEDICINE; SYSTEM; SCORE;
DOI
10.1161/CIRCOUTCOMES.118.004741
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
BACKGROUND: Patients with peripheral artery disease (PAD) are at risk of major adverse cardiac and cerebrovascular events. There are no readily available risk scores that can accurately identify which patients are most likely to sustain an event, making it difficult to identify those who might benefit from more aggressive intervention. Thus, we aimed to develop a novel predictive model, using machine learning methods on electronic health record data, to identify which PAD patients are most likely to develop major adverse cardiac and cerebrovascular events. METHODS AND RESULTS: Data were derived from patients diagnosed with PAD at 2 tertiary care institutions. Predictive models were built using a common data model that allowed for utilization of both structured (coded) and unstructured (text) data. Only data from time of entry into the health system up to PAD diagnosis were used for modeling. Models were developed and tested using nested cross-validation. A total of 7686 patients were included in learning our predictive models. Utilizing almost 1000 variables, our best predictive model accurately determined which PAD patients would go on to develop major adverse cardiac and cerebrovascular events with an area under the curve of 0.81 (95% CI, 0.80-0.83). CONCLUSIONS: Machine learning algorithms applied to data in the electronic health record can learn models that accurately identify PAD patients at risk of future major adverse cardiac and cerebrovascular events, highlighting the great potential of electronic health records to provide automated risk stratification for cardiovascular diseases. Common data models that can enable cross-institution research and technology development could potentially be an important aspect of widespread adoption of newer risk-stratification models.
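The abstract states that models were developed and tested with nested cross-validation and evaluated by area under the curve, but it does not name the algorithm, the number of folds, or the hyperparameters. The following is only a minimal sketch of the general technique under assumptions made for illustration: the data are synthetic, the "model" is a toy soft-thresholded difference-of-means scorer, and the names `make_data`, `fit`, `nested_cv`, and the penalty grid `lams` are hypothetical. It is not the authors' pipeline. The inner loop picks the penalty using training data only; the outer loop scores the refit model on data it never saw.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_folds(indices, k, rng):
    """Shuffle a copy of the indices and deal them into k folds."""
    idx = list(indices)
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit(X, y, rows, lam):
    """Toy scorer (an assumption, not the paper's model): per-feature
    difference of class means, soft-thresholded by the penalty lam."""
    pos = [X[i] for i in rows if y[i] == 1]
    neg = [X[i] for i in rows if y[i] == 0]
    weights = []
    for j in range(len(X[0])):
        d = sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        shrunk = max(abs(d) - lam, 0.0)
        weights.append(shrunk if d > 0 else -shrunk)
    return weights


def score(weights, row):
    return sum(w * x for w, x in zip(weights, row))


def evaluate(X, y, train_rows, test_rows, lam):
    """Fit on train_rows, return AUC on test_rows."""
    w = fit(X, y, train_rows, lam)
    return auc([score(w, X[i]) for i in test_rows], [y[i] for i in test_rows])


def nested_cv(X, y, lams, outer_k=5, inner_k=3, seed=0):
    rng = random.Random(seed)
    outer = k_folds(range(len(X)), outer_k, rng)
    outer_aucs = []
    for i, test in enumerate(outer):
        train = [r for f, fold in enumerate(outer) if f != i for r in fold]
        inner = k_folds(train, inner_k, rng)

        def inner_score(lam):
            # Mean validation AUC over inner folds, using training rows only.
            vals = []
            for v, val in enumerate(inner):
                fit_rows = [r for f, fold in enumerate(inner) if f != v for r in fold]
                vals.append(evaluate(X, y, fit_rows, val, lam))
            return sum(vals) / len(vals)

        best_lam = max(lams, key=inner_score)
        # Refit on the full outer-training set, score on the held-out fold.
        outer_aucs.append(evaluate(X, y, train, test, best_lam))
    return outer_aucs


def make_data(n=300, n_feat=10, seed=1):
    """Synthetic data: only the first two features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(0.8 * label if j < 2 else 0.0, 1.0) for j in range(n_feat)])
        y.append(label)
    return X, y


X, y = make_data()
fold_aucs = nested_cv(X, y, lams=[0.0, 0.2, 0.4])
print("outer-fold AUCs:", [round(a, 3) for a in fold_aucs])
print("mean AUC:", round(sum(fold_aucs) / len(fold_aucs), 3))
```

Because the penalty is chosen without ever touching the outer test fold, the mean of the outer-fold AUCs is a less optimistic performance estimate than one obtained by tuning and testing on the same split.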
Pages: 10