Accurate Clinical and Biomedical Named Entity Recognition at Scale

Cited by: 19
Authors
Kocaman, Veysel [1]
Talby, David [1]
Affiliation
[1] John Snow Labs Inc, 16192 Coastal Highway, Lewes, DE 19958 USA
Keywords
Spark; Natural language processing; Named entity recognition; Medical texts; Biomedical NER
DOI
10.1016/j.simpa.2022.100373
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
We introduce an agile, production-grade clinical and biomedical named entity recognition (NER) algorithm based on a modified BiLSTM-CNN-Char deep learning architecture built on top of Apache Spark. Our NER implementation establishes new state-of-the-art accuracy on 7 of 8 well-known biomedical NER benchmarks and on 3 clinical concept extraction challenges: 2010 i2b2/VA clinical concept extraction, 2014 n2c2 de-identification, and 2018 n2c2 medication extraction. Moreover, clinical NER models trained using this implementation are more accurate than the commercial entity extraction solutions AWS Medical Comprehend and Google Cloud Healthcare API by a large margin (8.9% and 6.7%, respectively), without using memory-intensive language models.
Pages: 7