ChemSpot: a hybrid system for chemical named entity recognition

Cited by: 161
Authors
Rocktäschel, Tim [1]
Weidlich, Michael [1]
Leser, Ulf [1]
Affiliations
[1] Humboldt Univ, Dept Comp Sci, D-12489 Berlin, Germany
Keywords
BIOMEDICAL TEXT; RECONSTRUCTION; IDENTIFICATION; DICTIONARY
DOI
10.1093/bioinformatics/bts183
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The accurate identification of chemicals in text is important for many applications, including computer-assisted reconstruction of metabolic networks or retrieval of information about substances in drug development. But due to the diversity of naming conventions and traditions for such molecules, this task is highly complex and should be supported by computational tools.
Results: We present ChemSpot, a named entity recognition (NER) tool for identifying mentions of chemicals in natural language texts, including trivial names, drugs, abbreviations, molecular formulas and International Union of Pure and Applied Chemistry (IUPAC) entities. Since the different classes of relevant entities have rather different naming characteristics, ChemSpot uses a hybrid approach combining a Conditional Random Field with a dictionary. It achieves an F1 measure of 68.1% on the SCAI corpus, outperforming the only other freely available chemical NER tool, OSCAR4, by 10.8 percentage points.
Pages: 1633-1640
Number of pages: 8
Related papers
50 in total
  • [21] A LANGUAGE INDEPENDENT NAMED ENTITY RECOGNITION SYSTEM
    Gifu, Daniela
    Vasilache, Gabriela
    PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE 'LINGUISTIC RESOURCES AND TOOLS FOR PROCESSING THE ROMANIAN LANGUAGE', 2014, 2014, : 181 - 188
  • [22] A Named Entity Recognition System for the Marathi Language
    Vaishali, P. Kadam
    Mahender, Namrata
    JOURNAL OF ADVANCED APPLIED SCIENTIFIC RESEARCH, 2024, 6 (03): : 229 - 243
  • [23] Named Entity Recognition System for the Biomedical Domain
    Sharma, Raghav
    Chauhan, Deependra
    Sharma, Raksha
    PROCEEDINGS OF THE 2022 17TH CONFERENCE ON COMPUTER SCIENCE AND INTELLIGENCE SYSTEMS (FEDCSIS), 2022, : 837 - 840
  • [24] A Hybrid System for Named Entity Metonymy Resolution
    Brun, Caroline
    Ehrmann, Maud
    Jacquet, Guillaume
    HUMAN LANGUAGE TECHNOLOGY: CHALLENGES OF THE INFORMATION SOCIETY, 2009, 5603 : 118 - 130
  • [25] Deep Learning-Based Named Entity Recognition System Using Hybrid Embedding
    Goyal, Archana
    Gupta, Vishal
    Kumar, Manish
    CYBERNETICS AND SYSTEMS, 2024, 55 (02) : 279 - 301
  • [26] A Hybrid Deep Learning Framework for Bacterial Named Entity Recognition
    Li, Xusheng
    Wang, Xiaoyan
    Zhong, Ran
    Zhong, Duo
    He, Tingting
    Hu, Xiaohua
    Jiang, Xingpeng
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018, : 428 - 433
  • [27] Chinese named entity recognition with a hybrid-statistical model
    Zhang, XY
    Wang, T
    Tang, JT
    Zhou, HP
    Chen, HW
    WEB TECHNOLOGIES RESEARCH AND DEVELOPMENT - APWEB 2005, 2005, 3399 : 900 - 912
  • [28] A HYBRID APPROACH FOR CHINESE NAMED ENTITY RECOGNITION IN MUSIC DOMAIN
    Zhang, Xueqing
    Liu, Zhen
    Qiu, Huizhong
    Fu, Yan
    EIGHTH IEEE INTERNATIONAL CONFERENCE ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, PROCEEDINGS, 2009, : 677 - 681
  • [29] A Hybrid Model Based on CRFs for Chinese Named Entity Recognition
    Li, Lishuang
    Ding, Zhuoye
    Huang, Degen
    Zhou, Huiwei
    ALPIT 2008: SEVENTH INTERNATIONAL CONFERENCE ON ADVANCED LANGUAGE PROCESSING AND WEB INFORMATION TECHNOLOGY, PROCEEDINGS, 2008, : 127 - 132
  • [30] Named entity recognition using hybrid machine learning approach
    Chiong, Raymond
    Wei, Wang
    PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2, 2006, : 578 - 583