Contextualized medication information extraction using Transformer-based deep learning architectures

Cited by: 11
Authors
Chen, Aokun [1, 2]
Yu, Zehao [1]
Yang, Xi [1]
Guo, Yi [1, 2]
Bian, Jiang [1, 2]
Wu, Yonghui [1, 2, 3]
Affiliations
[1] Univ Florida, Coll Med, Dept Hlth Outcomes & Biomed Informat, Gainesville, FL USA
[2] Univ Florida, Canc Informat Shared Resource, Hlth Canc Ctr, Gainesville, FL USA
[3] Clin & Translat Res Bldg,2004 Mowry Rd,POB 100177, Gainesville, FL 32610 USA
Keywords
Medication information extraction; Named entity recognition; Text classification; Deep learning; Clinical natural language processing; SYSTEM;
DOI
10.1016/j.jbi.2023.104370
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
Objective: To develop a natural language processing (NLP) system to extract medications and contextual information that help understand drug changes. This project is part of the 2022 n2c2 challenge. Materials and methods: We developed NLP systems for medication mention extraction, event classification (indicating whether medication changes are discussed), and context classification, which classifies the context of medication changes along 5 orthogonal dimensions related to drug changes. We explored 6 state-of-the-art pretrained transformer models for the three subtasks, including GatorTron, a large language model pretrained using >90 billion words of text (including >80 billion words from >290 million clinical notes identified at the University of Florida Health). We evaluated our NLP systems using annotated data and evaluation scripts provided by the 2022 n2c2 organizers. Results: Our GatorTron models achieved the best F1-scores of 0.9828 for medication extraction (ranked 3rd), 0.9379 for event classification (ranked 2nd), and the best micro-average accuracy of 0.9126 for context classification. GatorTron outperformed existing transformer models pretrained using smaller general English text and clinical text corpora, indicating the advantage of large language models. Conclusion: This study demonstrated the advantage of using large transformer models for contextual medication information extraction from clinical narratives.
Pages: 6