Identifying relations of medications with adverse drug events using recurrent convolutional neural networks and gradient boosting

Cited by: 45
Authors
Yang, Xi [1]
Bian, Jiang [1]
Fang, Ruogu [2]
Bjarnadottir, Ragnhildur I. [3]
Hogan, William R. [1]
Wu, Yonghui [1]
Affiliations
[1] Univ Florida, Coll Med, Dept Hlth Outcomes & Biomed Informat, Gainesville, FL USA
[2] Univ Florida, J Crayton Pruitt Family Dept Biomed Engn, Gainesville, FL USA
[3] Univ Florida, Coll Nursing, Dept Family Community & Hlth Syst Sci, Gainesville, FL 32611 USA
Keywords
named entity recognition; relation extraction; recurrent convolutional neural network; deep learning; clinical natural language processing; CLINICAL INFORMATION EXTRACTION; OF-THE-ART; SYSTEM; RECOGNITION; ASSERTIONS; ENTITIES
DOI
10.1093/jamia/ocz144
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objective: To develop a natural language processing system that identifies relations of medications with adverse drug events from clinical narratives. This project is part of the 2018 n2c2 challenge.
Materials and Methods: We developed a novel clinical named entity recognition method based on a recurrent convolutional neural network and compared it to a recurrent neural network implemented using the long short-term memory architecture. We also explored methods to integrate medical knowledge as embedding layers in neural networks and investigated 3 machine learning models for relation classification: support vector machines, random forests, and gradient boosting. The performance of our system was evaluated using annotated data and scripts provided by the 2018 n2c2 organizers.
Results: Our system was among the top ranked. Our best model submitted during this challenge (based on recurrent neural networks and support vector machines) achieved lenient F1 scores of 0.9287 for concept extraction (ranked third), 0.9459 for relation classification (ranked fourth), and 0.8778 for end-to-end relation extraction (ranked second). We developed a novel named entity recognition model based on a recurrent convolutional neural network and further investigated gradient boosting for relation classification. The new methods improved the lenient F1 scores of the 3 subtasks to 0.9292, 0.9633, and 0.8880, respectively, which are comparable to the best performance reported in this challenge.
Conclusion: This study demonstrated the feasibility of using machine learning methods to extract the relations of medications with adverse drug events from clinical narratives.
Pages: 65-72
Number of pages: 8