Automated risk assessment of newly detected atrial fibrillation poststroke from electronic health record data using machine learning and natural language processing

Times cited: 3
Authors
Sung, Sheng-Feng [1, 2]
Sung, Kuan-Lin [3 ]
Pan, Ru-Chiou [4 ]
Lee, Pei-Ju [5, 6]
Hu, Ya-Han [7 ]
Affiliations
[1] Ditmanson Med Fdn, Dept Internal Med, Div Neurol, Chiayi Christian Hosp, Chiayi, Taiwan
[2] Min Hwei Jr Coll Hlth Care Management, Dept Nursing, Tainan, Taiwan
[3] Natl Taiwan Univ, Sch Med, Taipei, Taiwan
[4] Ditmanson Med Fdn, Clin Data Ctr, Chiayi Christian Hosp, Dept Med Res, Chiayi, Taiwan
[5] Natl Chung Cheng Univ, Dept Informat Management, Minxiong Township, Chiayi County, Taiwan
[6] Natl Chung Cheng Univ, Inst Healthcare Informat Management, Minxiong Township, Chiayi County, Taiwan
[7] Natl Cent Univ, Dept Informat Management, Taoyuan, Taiwan
Source
FRONTIERS IN CARDIOVASCULAR MEDICINE | 2022 / Volume 9
Keywords
atrial fibrillation; electronic health records; ischemic stroke; natural language processing; prediction; TRANSIENT ISCHEMIC ATTACK; TEXT CLASSIFICATION; FEATURE-SELECTION; VASCULAR EVENTS; STROKE CARE; SCORE; VALIDATION; RECURRENCE; PREDICTION; TAIWAN;
DOI
10.3389/fcvm.2022.941237
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background: Timely detection of atrial fibrillation (AF) after stroke is highly clinically relevant, aiding decisions on the optimal strategies for secondary prevention of stroke. In the context of limited medical resources, it is crucial to prioritize extended heart rhythm monitoring by stratifying patients into risk groups according to their likelihood of newly detected AF (NDAF). This study aimed to develop an electronic health record (EHR)-based machine learning model to assess the risk of NDAF at an early stage after stroke.
Methods: Linked data between a hospital stroke registry and a deidentified research-based database including EHRs and administrative claims data were used. Demographic features, physiological measurements, routine laboratory results, and clinical free text were extracted from EHRs. The extreme gradient boosting algorithm was used to build the prediction model. The prediction performance was evaluated by the C-index and was compared to that of the AS5F and CHASE-LESS scores.
Results: The study population consisted of a training set of 4,064 patients and a temporal test set of 1,492 patients. During a median follow-up of 10.2 months, the incidence rate of NDAF was 87.0 per 1,000 person-years in the test set. On the test set, the model based on both structured and unstructured data achieved a C-index of 0.840, which was significantly higher than those of the AS5F (0.779, p = 0.023) and CHASE-LESS (0.768, p = 0.005) scores.
Conclusions: It is feasible to build a machine learning model to assess the risk of NDAF based on EHR data available at the time of hospital admission. Inclusion of information derived from clinical free text can significantly improve the model performance and may outperform risk scores developed using traditional statistical methods. Further studies are needed to assess the clinical usefulness of the prediction model.
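The abstract reports prediction performance as a C-index. As a rough illustration of what that metric measures, the following is a minimal sketch of Harrell's concordance index for right-censored follow-up data. It is not the authors' implementation; the function name `harrell_c_index` and the toy values are assumptions made for this example, and pairs with tied follow-up times are simply skipped.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Simplified Harrell's C-index (hypothetical helper, illustration only).

    times  : follow-up time for each patient
    events : True if the outcome (e.g. NDAF) was observed, False if censored
    risks  : predicted risk score; higher means the event is expected sooner
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the patient with the shorter
        # follow-up actually had the event (not censored).
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk, earlier event
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for this sketch (not taken from the study).
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [True, True, False, True]
toy_risks = [0.5, 0.9, 0.4, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by time to event, which is the scale on which the reported values of 0.840, 0.779, and 0.768 are compared.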
Pages: 11