Improving the accuracy of automated gout flare ascertainment using natural language processing of electronic health records and linked Medicare claims data

Cited by: 2
Authors
Yoshida, Kazuki [1, 2]
Cai, Tianrun [1, 2]
Bessette, Lily G. [3]
Kim, Erin [3]
Lee, Su Been [3]
Zabotka, Luke E. [3]
Sun, Alec [3]
Mastrorilli, Julianna M. [3]
Oduol, Theresa A. [3]
Liu, Jun [3]
Solomon, Daniel H. [1, 2, 3]
Kim, Seoyoung C. [1, 2, 3]
Desai, Rishi J. [2, 3]
Liao, Katherine P. [1, 2, 4]
Affiliations
[1] Brigham & Womens Hosp, Dept Med, Div Rheumatol Inflammat & Immun, 75 Francis St, Boston, MA 02115 USA
[2] Harvard Med Sch, Dept Med, Boston, MA 02115 USA
[3] Brigham & Womens Hosp, Dept Med, Div Pharmacoepidemiol & Pharmacoecon, 75 Francis St, Boston, MA 02115 USA
[4] Harvard Med Sch, Dept Biomed Informat, Boston, MA 02115 USA
Keywords
gout; natural language processing; AMERICAN-COLLEGE; VALIDATION; DEFINITION
DOI
10.1002/pds.5684
Chinese Library Classification (CLC)
R1 [Preventive medicine, hygiene]
Discipline classification codes
1004; 120402
Abstract
Background: We aimed to determine whether integrating concepts extracted from electronic health record (EHR) notes using natural language processing (NLP) could improve the identification of gout flares. Methods: Using Medicare claims linked with EHR data, we selected gout patients who initiated urate-lowering therapy (ULT). Patients' 12-month baseline period and on-treatment follow-up were segmented into 1-month units. We retrieved EHR notes for months with gout diagnosis codes and processed the notes for NLP concepts. We selected a random sample of 500 patients and reviewed each of their notes for the presence of a physician-documented gout flare. Months containing at least one note mentioning gout flares were considered months with events. We used 60% of patients to train predictive models with LASSO. We evaluated the models by the area under the curve (AUC) in the validation data and examined positive and negative predictive values (PPV/NPV). Results: We extracted and labeled 839 months of follow-up (280 with gout flares). The claims-only model selected 20 variables (AUC = 0.69). The NLP concept-only model selected 15 (AUC = 0.69). The combined model selected 32 claims variables and 13 NLP concepts (AUC = 0.73). The claims-only model had a PPV of 0.64 [0.50, 0.77] and an NPV of 0.71 [0.65, 0.76], whereas the combined model had a PPV of 0.76 [0.61, 0.88] and an NPV of 0.71 [0.65, 0.76]. Conclusion: Adding NLP concept variables to claims variables resulted in a small improvement in the identification of gout flares. Our data-driven claims-only model and our combined claims/NLP-concept model outperformed existing rule-based claims algorithms reliant on medication use, diagnosis, and procedure codes.
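The evaluation metrics named in the abstract (AUC, PPV, NPV) can be illustrated with a minimal Python sketch. This is not the authors' code: the labels, scores, and the 0.5 cutoff below are made-up values for illustration only, and the function names `auc` and `ppv_npv` are hypothetical. Each element stands for one patient-month, with label 1 meaning a chart-confirmed flare month and the score standing in for a model's predicted probability.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    Equals the probability that a randomly chosen flare month scores higher
    than a randomly chosen non-flare month, counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one flare and one non-flare month")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ppv_npv(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Positive and negative predictive values at a score cutoff."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_flare = s >= threshold
        if predicted_flare and y == 1:
            tp += 1
        elif predicted_flare and y == 0:
            fp += 1
        elif not predicted_flare and y == 0:
            tn += 1
        else:
            fn += 1
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("threshold leaves one predicted class empty")
    return tp / (tp + fp), tn / (tn + fn)


# Made-up validation data: 8 patient-months (assumption, not study data).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

print("AUC:", auc(labels, scores))
print("PPV, NPV at 0.5:", ppv_npv(labels, scores, 0.5))
```

In practice, PPV and NPV depend on both the chosen cutoff and the prevalence of flare months in the validation set, which is why the abstract reports them alongside the threshold-free AUC.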
Pages: 9