ChemSpot: a hybrid system for chemical named entity recognition

Cited by: 161
Authors
Rocktäschel, Tim [1]
Weidlich, Michael [1]
Leser, Ulf [1]
Affiliations
[1] Humboldt Univ, Dept Comp Sci, D-12489 Berlin, Germany
Keywords
BIOMEDICAL TEXT; RECONSTRUCTION; IDENTIFICATION; DICTIONARY;
DOI
10.1093/bioinformatics/bts183
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: The accurate identification of chemicals in text is important for many applications, including computer-assisted reconstruction of metabolic networks or retrieval of information about substances in drug development. But due to the diversity of naming conventions and traditions for such molecules, this task is highly complex and should be supported by computational tools.
Results: We present ChemSpot, a named entity recognition (NER) tool for identifying mentions of chemicals in natural language texts, including trivial names, drugs, abbreviations, molecular formulas and International Union of Pure and Applied Chemistry (IUPAC) entities. Since the different classes of relevant entities have rather different naming characteristics, ChemSpot uses a hybrid approach combining a Conditional Random Field (CRF) with a dictionary. It achieves an F1 measure of 68.1% on the SCAI corpus, outperforming the only other freely available chemical NER tool, OSCAR4, by 10.8 percentage points.
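As a rough illustration of the hybrid idea described in the abstract, the following minimal Python sketch merges entity spans from a sequence model with spans from a dictionary lookup and then scores the result with an exact-match F1 measure. It is not ChemSpot's implementation: the function names, the "keep the longer span on overlap" policy, the example sentence, and the hard-coded model output are all assumptions made here for illustration, and no real CRF is trained.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def dictionary_spans(text: str, dictionary: List[str]) -> List[Span]:
    """Find every case-insensitive occurrence of each dictionary term."""
    lowered = text.lower()
    spans = []
    for term in dictionary:
        needle = term.lower()
        start = lowered.find(needle)
        while start != -1:
            spans.append((start, start + len(needle)))
            start = lowered.find(needle, start + 1)
    return spans


def merge_spans(model_spans: List[Span], dict_spans: List[Span]) -> List[Span]:
    """Union both span lists; on overlap keep the longer span (assumed policy)."""
    candidates = sorted(set(model_spans) | set(dict_spans),
                        key=lambda s: (-(s[1] - s[0]), s[0]))
    kept: List[Span] = []
    for start, end in candidates:
        # Accept a span only if it overlaps none of the spans kept so far.
        if all(end <= k_start or start >= k_end for k_start, k_end in kept):
            kept.append((start, end))
    return sorted(kept)


def f1_score(predicted: List[Span], gold: List[Span]) -> float:
    """Exact-match F1: harmonic mean of precision and recall over spans."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: a trivial name plus a systematic name.
text = "aspirin is 2-acetoxybenzoic acid"
# Stand-in for sequence-model output (hard-coded, not a real CRF prediction).
model_spans = [(11, 32)]
dict_spans = dictionary_spans(text, ["aspirin", "acid"])
merged = merge_spans(model_spans, dict_spans)
gold = [(0, 7), (11, 32)]
print(merged, f1_score(merged, gold))
```

In this toy case the dictionary contributes the trivial name, the stand-in model contributes the systematic name, and the shorter dictionary hit "acid" is dropped because it lies inside the longer model span.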
Pages: 1633-1640
Number of pages: 8