2018 n2c2 shared task on adverse drug events and medication extraction in electronic health records

Cited by: 168
Authors
Henry, Sam [1]
Buchan, Kevin [2]
Filannino, Michele [1,3]
Stubbs, Amber [4]
Uzuner, Ozlem [1,3,5]
Affiliations
[1] George Mason Univ, Dept Informat Sci & Technol, 4400 Univ Dr, Fairfax, VA 22030 USA
[2] SUNY Albany, Dept Informat Sci, Albany, NY 12222 USA
[3] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[4] Simmons Univ, Dept Math & Comp Sci, Boston, MA USA
[5] Harvard Med Sch, Dept Biomed Informat, Boston, MA 02115 USA
Funding
US National Institutes of Health
Keywords
OF-THE-ART; CLINICAL NARRATIVES; DE-IDENTIFICATION; HEART-DISEASE; RISK-FACTORS; INFORMATION;
DOI
10.1093/jamia/ocz166
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: This article summarizes the preparation, organization, evaluation, and results of Track 2 of the 2018 National NLP Clinical Challenges shared task. Track 2 focused on extraction of adverse drug events (ADEs) from clinical records and evaluated 3 tasks: concept extraction, relation classification, and end-to-end systems. We perform an analysis of the results to identify the state of the art in these tasks, learn from it, and build on it.
Materials and Methods: For all tasks, teams were given raw text of narrative discharge summaries, and in all the tasks, participants proposed deep learning-based methods with hand-designed features. In the concept extraction task, participants used sequence labelling models (bidirectional long short-term memory being the most popular), whereas in the relation classification task, they also experimented with instance-based classifiers (namely support vector machines and rules). Ensemble methods were also popular.
Results: A total of 28 teams participated in task 1, with 21 teams in tasks 2 and 3. The best performing systems set a high performance bar with F1 scores of 0.9418 for concept extraction, 0.9630 for relation classification, and 0.8905 for end-to-end. However, the results were much lower for concepts and relations of Reasons and ADEs. These were often missed because local context is insufficient to identify them.
Conclusions: This challenge shows that clinical concept extraction and relation classification systems have a high performance for many concept types, but significant improvement is still required for ADEs and Reasons. Incorporating the larger context or outside knowledge will likely improve the performance of future systems.
Pages: 3-12
Page count: 10