Named Entity Recognition From Biomedical Texts Using a Fusion Attention-Based BiLSTM-CRF

Cited by: 34
Authors
Wei, Hao [1]
Gao, Mingyuan [1]
Zhou, Ai [1]
Chen, Fei [1]
Qu, Wen [1]
Wang, Chunli [1]
Lu, Mingyu [1]
Affiliation
[1] Dalian Maritime Univ, Informat Sci & Technol Coll, Dalian 116026, Peoples R China
Keywords
Biomedical text; named entity recognition; attention mechanism; long short-term memory; conditional random field
DOI
10.1109/ACCESS.2019.2920734
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Biomedical named entity recognition (BNER) is the basis of biomedical text mining and one of the core sub-tasks of information extraction. Earlier BNER models based on conventional machine learning rely on time-consuming feature engineering. Most neural network methods ease this problem by learning features automatically, but they cannot focus on the most significant regions when capturing features. In this paper, we propose an attention-based BiLSTM-CRF model. First, the model adopts a bidirectional long short-term memory network (BiLSTM) to obtain more complete context information. At the same time, an attention mechanism is introduced to improve the vector representations in the BiLSTM. We design different attention weight redistribution methods and fuse them, which effectively prevents the loss of significant information during feature extraction. Finally, combining the BiLSTM with a conditional random field (CRF) layer addresses the inability to handle the strong dependencies between tags in the sequence. With this simple architecture, our model achieves reasonable performance on the JNLPBA corpus, obtaining an F1-score of 73.50. The model enhances the ability of the neural network to extract significant information and does not rely on any feature engineering, using only general pre-trained word vectors. This gives the model high portability and extensibility.
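The role of the CRF layer can be illustrated with a minimal sketch of Viterbi decoding over per-token scores. This is not the authors' implementation: the function name `viterbi_decode`, the three-tag set, and all scores are hypothetical and hand-picked for illustration, and the emission scores merely stand in for what a BiLSTM with attention might output. The point is that a transition penalty on an invalid tag pair changes the decoded sequence compared with choosing the best tag for each token independently.

```python
from typing import Dict, List, Tuple

Tag = str


def viterbi_decode(
    emissions: List[Dict[Tag, float]],
    transitions: Dict[Tuple[Tag, Tag], float],
    tags: List[Tag],
) -> List[Tag]:
    """Return the tag sequence maximizing emission + transition scores."""
    if not emissions:
        return []
    # best[t][tag]: score of the best path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[Tag, Tag]] = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            # Missing transition entries are treated as a score of 0.0.
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical scores for a three-token sentence (assumed, not from the paper).
TAGS = ["O", "B-protein", "I-protein"]
EMISSIONS = [
    {"O": 1.0, "B-protein": 0.8, "I-protein": 0.0},
    {"O": 0.2, "B-protein": 0.1, "I-protein": 1.5},
    {"O": 1.2, "B-protein": 0.0, "I-protein": 0.3},
]
# Penalize the invalid move from outside an entity straight to its inside.
TRANSITIONS = {("O", "I-protein"): -10.0}

greedy = [max(TAGS, key=lambda tag: e[tag]) for e in EMISSIONS]
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(greedy)   # ['O', 'I-protein', 'O']
print(decoded)  # ['B-protein', 'I-protein', 'O']
```

Per-token selection produces an `I-protein` tag with no preceding `B-protein`, while the sequence-level decode repairs it. In a trained CRF the transition scores would be learned rather than set by hand.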
Pages: 73627-73636
Number of pages: 10