Predicting Future Cardiovascular Events in Patients With Peripheral Artery Disease Using Electronic Health Record Data

Cited by: 49
Authors
Ross, Elsie Gyang [1, 2]
Jung, Kenneth [2]
Dudley, Joel T. [3]
Li, Li [3, 4]
Leeper, Nicholas J. [1]
Shah, Nigam H. [2]
Affiliations
[1] Stanford Univ, Sch Med, Div Vasc Surg, Stanford, CA 94305 USA
[2] Stanford Univ, Sch Med, Ctr Biomed Informat Res, Stanford, CA 94305 USA
[3] Icahn Sch Med Mt Sinai, New York, NY 10029 USA
[4] Sema4, Stamford, CT USA
Source
CIRCULATION-CARDIOVASCULAR QUALITY AND OUTCOMES | 2019 / Volume 12 / Issue 03
Keywords
electronic health records; machine learning; mortality; peripheral arterial disease; risk; COMMON DATA MODELS; RISK PREDICTION; MORTALITY; OUTCOMES; IDENTIFICATION; PERFORMANCE; PREVALENCE; MEDICINE; SYSTEM; SCORE;
DOI
10.1161/CIRCOUTCOMES.118.004741
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
BACKGROUND: Patients with peripheral artery disease (PAD) are at risk of major adverse cardiac and cerebrovascular events. There are no readily available risk scores that can accurately identify which patients are most likely to sustain an event, making it difficult to identify those who might benefit from more aggressive intervention. Thus, we aimed to develop a novel predictive model, using machine learning methods on electronic health record data, to identify which PAD patients are most likely to develop major adverse cardiac and cerebrovascular events.

METHODS AND RESULTS: Data were derived from patients diagnosed with PAD at 2 tertiary care institutions. Predictive models were built using a common data model that allowed for utilization of both structured (coded) and unstructured (text) data. Only data from time of entry into the health system up to PAD diagnosis were used for modeling. Models were developed and tested using nested cross-validation. A total of 7686 patients were included in learning our predictive models. Utilizing almost 1000 variables, our best predictive model accurately determined which PAD patients would go on to develop major adverse cardiac and cerebrovascular events with an area under the curve of 0.81 (95% CI, 0.80-0.83).

CONCLUSIONS: Machine learning algorithms applied to data in the electronic health record can learn models that accurately identify PAD patients at risk of future major adverse cardiac and cerebrovascular events, highlighting the great potential of electronic health records to provide automated risk stratification for cardiovascular diseases. Common data models that can enable cross-institution research and technology development could potentially be an important aspect of widespread adoption of newer risk-stratification models.
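
The abstract reports that models were developed and tested using nested cross-validation and summarized by area under the curve. The following is a minimal sketch of that evaluation scheme only, not the authors' pipeline. The synthetic data, the toy linear scorer, the `top_k` hyperparameter, and all function names are assumptions introduced for illustration. The inner loop selects the hyperparameter using training rows only, and the outer loop scores the selected setting on rows that played no part in the selection.

```python
import random
from statistics import mean


def auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(rows, y, k):
    """Split row indices into k disjoint folds, each holding both classes."""
    pos = [i for i in rows if y[i] == 1]
    neg = [i for i in rows if y[i] == 0]
    return [pos[f::k] + neg[f::k] for f in range(k)]


def fit(X, y, rows, top_k):
    """Toy scorer (an assumption): weight = class-mean difference, keep top_k features."""
    weights = []
    for j in range(len(X[0])):
        pos = [X[i][j] for i in rows if y[i] == 1]
        neg = [X[i][j] for i in rows if y[i] == 0]
        weights.append(mean(pos) - mean(neg))
    keep = sorted(range(len(weights)), key=lambda j: -abs(weights[j]))[:top_k]
    return [(j, weights[j]) for j in keep]


def predict(model, X, rows):
    """Weighted sum of the kept features for each requested row."""
    return [sum(w * X[i][j] for j, w in model) for i in rows]


def evaluate(X, y, train, test, top_k):
    """Fit on train rows, return AUC on test rows."""
    model = fit(X, y, train, top_k)
    return auc([y[i] for i in test], predict(model, X, test))


def nested_cv(X, y, grid, outer_k=5, inner_k=3, seed=0):
    """Return (mean outer-fold AUC, list of outer-fold AUCs)."""
    rows = list(range(len(y)))
    random.Random(seed).shuffle(rows)
    outer_aucs = []
    for test in stratified_folds(rows, y, outer_k):
        held_out = set(test)
        train = [i for i in rows if i not in held_out]
        inner = stratified_folds(train, y, inner_k)

        def inner_score(top_k):
            # Mean validation AUC across inner folds, using training rows only.
            scores = []
            for val in inner:
                val_set = set(val)
                fit_rows = [i for i in train if i not in val_set]
                scores.append(evaluate(X, y, fit_rows, val, top_k))
            return mean(scores)

        best = max(grid, key=inner_score)
        # Refit on the full outer-training set and score once on the held-out fold.
        outer_aucs.append(evaluate(X, y, train, test, best))
    return mean(outer_aucs), outer_aucs


def make_toy_data(n=200, n_features=6, n_informative=2, seed=0):
    """Synthetic stand-in for patient data: only the first features carry signal."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(y[i] if j < n_informative else 0.0, 1.0)
          for j in range(n_features)] for i in range(n)]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mean_auc, fold_aucs = nested_cv(X, y, grid=[1, 2, 4, 6])
    print(f"mean outer-fold AUC: {mean_auc:.2f}")
```

Keeping hyperparameter selection inside the inner loop means the outer-fold AUC is not inflated by tuning on the same rows that are used for the final estimate. The numbers this sketch prints describe the synthetic data only and say nothing about the study's reported 0.81.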
Pages: 10