ChemSpot: a hybrid system for chemical named entity recognition

Cited by: 161
Authors
Rocktäschel, Tim [1]
Weidlich, Michael [1]
Leser, Ulf [1]
Affiliations
[1] Humboldt Univ, Dept Comp Sci, D-12489 Berlin, Germany
Keywords
BIOMEDICAL TEXT; RECONSTRUCTION; IDENTIFICATION; DICTIONARY
DOI
10.1093/bioinformatics/bts183
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The accurate identification of chemicals in text is important for many applications, including computer-assisted reconstruction of metabolic networks and retrieval of information about substances in drug development. However, because naming conventions and traditions for such molecules are so diverse, this task is highly complex and should be supported by computational tools.
Results: We present ChemSpot, a named entity recognition (NER) tool for identifying mentions of chemicals in natural language texts, including trivial names, drugs, abbreviations, molecular formulas and International Union of Pure and Applied Chemistry (IUPAC) entities. Since the different classes of relevant entities have rather different naming characteristics, ChemSpot uses a hybrid approach combining a Conditional Random Field with a dictionary. It achieves an F1 measure of 68.1% on the SCAI corpus, outperforming the only other freely available chemical NER tool, OSCAR4, by 10.8 percentage points.
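The hybrid idea and the F1 measure mentioned in the abstract can be illustrated with a minimal sketch. This is not ChemSpot's implementation: the function names, the hard-coded "model" spans standing in for Conditional Random Field output, the rule that model spans win when they overlap a dictionary match, and the exact-match scoring are all assumptions made for illustration only.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def dictionary_spans(text: str, terms: List[str]) -> Set[Span]:
    """Find every case-insensitive occurrence of each dictionary term.

    Assumes lowercasing does not change string length (true for ASCII text).
    """
    lowered = text.lower()
    spans: Set[Span] = set()
    for term in terms:
        needle = term.lower()
        start = lowered.find(needle)
        while start != -1:
            spans.add((start, start + len(needle)))
            start = lowered.find(needle, start + 1)
    return spans


def merge_spans(model: Set[Span], dictionary: Set[Span]) -> Set[Span]:
    """Union both sources; a dictionary span is dropped if it overlaps a model span.

    The priority rule is an assumption of this sketch, not a documented behaviour.
    """
    merged = set(model)
    for d_start, d_end in dictionary:
        overlaps = any(d_start < m_end and m_start < d_end for m_start, m_end in model)
        if not overlaps:
            merged.add((d_start, d_end))
    return merged


def f1_score(predicted: Set[Span], gold: Set[Span]) -> float:
    """Exact-match F1: harmonic mean of precision and recall over spans."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data; the model spans stand in for a trained tagger's output.
text = "Aspirin and 2-acetoxybenzoic acid are the same compound."
systematic = "2-acetoxybenzoic acid"
model_spans = {(text.index(systematic), text.index(systematic) + len(systematic))}
dict_spans = dictionary_spans(text, ["aspirin"])
gold_spans = model_spans | {(0, len("Aspirin"))}

hybrid = merge_spans(model_spans, dict_spans)
print("dictionary only F1:", f1_score(dict_spans, gold_spans))
print("hybrid F1:", f1_score(hybrid, gold_spans))
```

In this toy setting each source finds only one of the two mentions, so combining them raises recall without lowering precision, which is the motivation the abstract gives for a hybrid design.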
Pages: 1633-1640
Number of pages: 8