Improving the accuracy of automated gout flare ascertainment using natural language processing of electronic health records and linked Medicare claims data

Cited by: 2

Authors:

Yoshida, Kazuki ^{[1,2]}

Cai, Tianrun ^{[1,2]}

Bessette, Lily G. ^{[3]}

Kim, Erin ^{[3]}

Lee, Su Been ^{[3]}

Zabotka, Luke E. ^{[3]}

Sun, Alec ^{[3]}

Mastrorilli, Julianna M. ^{[3]}

Oduol, Theresa A. ^{[3]}

Liu, Jun ^{[3]}

Solomon, Daniel H. ^{[1,2,3]}

Kim, Seoyoung C. ^{[1,2,3]}

Desai, Rishi J. ^{[2,3]}

Liao, Katherine P. ^{[1,2,4]}

Affiliations:

[1] Brigham & Womens Hosp, Dept Med, Div Rheumatol Inflammat & Immun, 75 Francis St, Boston, MA 02115 USA

[2] Harvard Med Sch, Dept Med, Boston, MA 02115 USA

[3] Brigham & Womens Hosp, Dept Med, Div Pharmacoepidemiol & Pharmacoecon, 75 Francis St, Boston, MA 02115 USA

[4] Harvard Med Sch, Dept Biomed Informat, Boston, MA 02115 USA

Source:

PHARMACOEPIDEMIOLOGY AND DRUG SAFETY | 2024 / Vol. 33 / Issue 01

Keywords:

gout; natural language processing; AMERICAN-COLLEGE; VALIDATION; DEFINITION;

DOI:

10.1002/pds.5684

Chinese Library Classification number:

R1 [Preventive medicine, hygiene];

Discipline classification code:

1004 ; 120402 ;

Abstract:

Background: We aimed to determine whether integrating concepts extracted from electronic health record (EHR) notes using natural language processing (NLP) could improve the identification of gout flares. Methods: Using Medicare claims linked with EHR data, we selected gout patients who initiated urate-lowering therapy (ULT). Each patient's 12-month baseline period and on-treatment follow-up were segmented into 1-month units. We retrieved EHR notes for months with gout diagnosis codes and processed the notes for NLP concepts. We selected a random sample of 500 patients and reviewed each of their notes for the presence of a physician-documented gout flare. Months containing at least one note mentioning a gout flare were considered months with events. We used 60% of patients to train predictive models with LASSO. We evaluated the models by the area under the curve (AUC) in the validation data and examined positive and negative predictive values (PPV and NPV). Results: We extracted and labeled 839 months of follow-up (280 with gout flares). The claims-only model selected 20 variables (AUC = 0.69). The NLP concept-only model selected 15 (AUC = 0.69). The combined model selected 32 claims variables and 13 NLP concepts (AUC = 0.73). The claims-only model had a PPV of 0.64 [0.50, 0.77] and an NPV of 0.71 [0.65, 0.76], whereas the combined model had a PPV of 0.76 [0.61, 0.88] and an NPV of 0.71 [0.65, 0.76]. Conclusion: Adding NLP concept variables to claims variables resulted in a small improvement in the identification of gout flares. Our data-driven claims-only model and our combined claims/NLP-concept model outperformed existing rule-based claims algorithms reliant on medication use, diagnosis, and procedure codes.
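
Illustrative note: the evaluation metrics named in the abstract (AUC, PPV, NPV) can be computed from month-level labels and model scores roughly as sketched below. This is a minimal sketch and not the authors' code. The function names `auc` and `ppv_npv`, the 0.5 threshold, and the toy labels and scores are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def ppv_npv(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Positive and negative predictive values at a given score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_flare = s >= threshold
        if predicted_flare and y == 1:
            tp += 1
        elif predicted_flare:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical toy data: 1 = month with a chart-confirmed flare, 0 = no flare.
labels = [1, 0, 1, 0, 0, 1, 0, 1]
# Hypothetical model scores for the same eight patient-months.
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.3, 0.4]

print(auc(labels, scores))           # 0.875
print(ppv_npv(labels, scores, 0.5))  # (0.75, 0.75)
```

In this toy example, 14 of the 16 flare/non-flare pairs are ranked correctly (AUC = 0.875), and at the assumed threshold three of four predicted flares and three of four predicted non-flares are correct.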


Pages: 9


