Deep learning with word embeddings improves biomedical named entity recognition

Cited by: 335
Authors
Habibi, Maryam [1]
Weber, Leon [1]
Neves, Mariana [2]
Wiegandt, David Luis [1]
Leser, Ulf [1]
Affiliations
[1] Humboldt Univ, Dept Comp Sci, D-10099 Berlin, Germany
[2] Hasso Plattner Inst, Enterprise Platform & Integrat Concepts, D-14482 Potsdam, Germany
DOI
10.1093/bioinformatics/btx228
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Text mining has become an important tool for biomedical research. The most fundamental text-mining task is the recognition of biomedical named entities (NER), such as genes, chemicals and diseases. Current NER methods rely on pre-defined features which try to capture the specific surface properties of entity types, properties of the typical local context, background knowledge, and linguistic information. State-of-the-art tools are entity-specific, as dictionaries and empirically optimal feature sets differ between entity types, which makes their development costly. Furthermore, features are often optimized for a specific gold standard corpus, which makes extrapolation of quality measures difficult.
Results: We show that a completely generic method based on deep learning and statistical word embeddings [called long short-term memory network-conditional random field (LSTM-CRF)] outperforms state-of-the-art entity-specific NER tools, and often by a large margin. To this end, we compared the performance of LSTM-CRF on 33 data sets covering five different entity classes with that of best-of-class NER tools and an entity-agnostic CRF implementation. On average, F1-score of LSTM-CRF is 5% above that of the baselines, mostly due to a sharp increase in recall.
Pages: I37 - I48
Number of pages: 12
Related papers
50 in total
  • [21] Deep learning methods for biomedical named entity recognition: a survey and qualitative comparison
    Song, Bosheng
    Li, Fen
    Liu, Yuansheng
    Zeng, Xiangxiang
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (06)
  • [22] Improving deep learning method for biomedical named entity recognition by using entity definition information
    Ying Xiong
    Shuai Chen
    Buzhou Tang
    Qingcai Chen
    Xiaolong Wang
    Jun Yan
    Yi Zhou
    BMC Bioinformatics, 22
  • [23] Improving deep learning method for biomedical named entity recognition by using entity definition information
    Xiong, Ying
    Chen, Shuai
    Tang, Buzhou
    Chen, Qingcai
    Wang, Xiaolong
    Yan, Jun
    Zhou, Yi
    BMC BIOINFORMATICS, 2021, 22 (SUPPL 1)
  • [24] A Survey on Deep Learning for Named Entity Recognition
    Li, Jing
    Sun, Aixin
    Han, Jianglei
    Li, Chenliang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (01) : 50 - 70
  • [25] Named entity recognition based on deep learning
    Ji Z.
    Kong D.
    Liu W.
    Dong W.
    Sang Y.
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2022, 28 (06) : 1603 - 1615
  • [26] Turkish Named Entity Recognition with Deep Learning
    Gunes, Asim
    Tantug, A. Cuneyd
    2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018
  • [27] Word Embeddings for Unsupervised Named Entity Linking
    Nozza, Debora
    Sas, Cezar
    Fersini, Elisabetta
    Messina, Enza
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT II, 2019, 11776 : 115 - 132
  • [28] Deep learning for named entity recognition: a survey
    Hu Z.
    Hou W.
    Liu X.
    Neural Comput. Appl., 16 : 8995 - 9022
  • [29] A Deep Learning Solution to Named Entity Recognition
    Murthy, V. Rudra
    Bhattacharyya, Pushpak
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, (CICLING 2016), PT I, 2018, 9623 : 427 - 438
  • [30] A Multichannel Biomedical Named Entity Recognition Model Based on Multitask Learning and Contextualized Word Representations
    Wei, Hao
    Gao, Mingyuan
    Zhou, Ai
    Chen, Fei
    Qu, Wen
    Zhang, Yijia
    Lu, Mingyu
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2020, 2020