Contextualized medication information extraction using Transformer-based deep learning architectures

Cited by: 11

Authors:

Chen, Aokun ^{[1,2]}

Yu, Zehao ^{[1]}

Yang, Xi ^{[1]}

Guo, Yi ^{[1,2]}

Bian, Jiang ^{[1,2]}

Wu, Yonghui ^{[1,2,3]}

Affiliations:

[1] Univ Florida, Coll Med, Dept Hlth Outcomes & Biomed Informat, Gainesville, FL USA

[2] Univ Florida, Canc Informat Shared Resource, Hlth Canc Ctr, Gainesville, FL USA

[3] Clin & Translat Res Bldg,2004 Mowry Rd,POB 100177, Gainesville, FL 32610 USA

Source:

JOURNAL OF BIOMEDICAL INFORMATICS | 2023 / Volume 142

Keywords:

Medication information extraction; Named entity recognition; Text classification; Deep learning; Clinical natural language processing; SYSTEM;

DOI:

10.1016/j.jbi.2023.104370

Chinese Library Classification number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

Objective: To develop a natural language processing (NLP) system to extract medications and the contextual information that helps in understanding drug changes. This project is part of the 2022 n2c2 challenge. Materials and methods: We developed NLP systems for medication mention extraction, event classification (indicating whether medication changes are discussed), and context classification, which classifies the context of medication changes along 5 orthogonal dimensions related to drug changes. We explored 6 state-of-the-art pretrained transformer models for the three subtasks, including GatorTron, a large language model pretrained using >90 billion words of text (including >80 billion words from >290 million clinical notes identified at the University of Florida Health). We evaluated our NLP systems using annotated data and evaluation scripts provided by the 2022 n2c2 organizers. Results: Our GatorTron models achieved the best F1-scores of 0.9828 for medication extraction (ranked 3rd) and 0.9379 for event classification (ranked 2nd), and the best micro-average accuracy of 0.9126 for context classification. GatorTron outperformed existing transformer models pretrained using smaller general English text and clinical text corpora, indicating the advantage of large language models. Conclusion: This study demonstrated the advantage of using large transformer models for contextual medication information extraction from clinical narratives.

Pages: 6