Research on Named Entity Recognition Methods in Chinese Forest Disease Texts

Cited by: 2
Authors
Wang, Qi [1]
Su, Xiyou [1]
Affiliations
[1] Beijing Forestry Univ, Sch Informat Sci & Technol, Beijing 100083, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 08
Funding
National Natural Science Foundation of China;
Keywords
disease; named entity recognition; multi-feature; transformer; bi-gated recurrent unit; CRF;
DOI
10.3390/app12083885
Chinese Library Classification
O6 [Chemistry];
Subject classification code
0703;
Abstract
Named entity recognition (NER) of forest diseases plays a key role in knowledge extraction in forestry. This paper proposes an NER method based on multi-feature embedding, a transformer encoder, a bidirectional gated recurrent unit (BiGRU), and conditional random fields (CRF). Several features are introduced to match the characteristics of the forest disease corpus and improve the method's accuracy. We analyze the characteristics of forest disease texts; carry out pre-processing, labeling, and extraction of multiple features; and construct a forest disease text data set. In the input representation layer, the method integrates multiple features, such as characters, radicals, word boundaries, and parts of speech. The transformer's encoding layer then captures implicit features (e.g., sentence context features). The resulting features are passed to the BiGRU layer for further deep feature extraction. Finally, the CRF model learns constraints and outputs the optimal annotation of disease names, damage sites, and drug entities in the forest disease texts. Experimental results on the self-built data set of forest disease texts show that the precision of the proposed method for entity recognition reached more than 93%, indicating that it can effectively handle named entity recognition in forest disease texts.
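The final CRF step described in the abstract can be illustrated with a minimal decoding sketch. This is not the authors' implementation: the tag set (`O`, `B-DIS`, `I-DIS`), the emission scores, and the hand-written transition rule are all assumptions made for illustration, whereas the abstract states that the constraints are learned. The sketch only shows how transition constraints let Viterbi decoding reject tag sequences that a per-character argmax would produce.

```python
# Illustrative sketch only: tags and scores are made up, not taken from the paper.
TAGS = ["O", "B-DIS", "I-DIS"]  # hypothetical BIO tags for disease names
NEG_INF = float("-inf")


def transition_score(prev_tag, next_tag):
    """Hand-written BIO constraint (assumption): I-X may only follow B-X or I-X."""
    if next_tag.startswith("I-"):
        entity = next_tag[2:]
        if prev_tag not in ("B-" + entity, "I-" + entity):
            return NEG_INF
    return 0.0


def viterbi(emissions):
    """Return the highest-scoring valid tag sequence.

    emissions: list of dicts mapping tag -> score, one dict per character.
    """
    # A sequence cannot start inside an entity, so I- tags are blocked at position 0.
    best = {t: (NEG_INF if t.startswith("I-") else emissions[0][t]) for t in TAGS}
    backpointers = []
    for scores in emissions[1:]:
        new_best, pointers = {}, {}
        for tag in TAGS:
            prev = max(TAGS, key=lambda p: best[p] + transition_score(p, tag))
            new_best[tag] = best[prev] + transition_score(prev, tag) + scores[tag]
            pointers[tag] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag.
    tag = max(TAGS, key=lambda t: best[t])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


# Hypothetical per-character scores, standing in for the BiGRU layer's output.
emissions = [
    {"O": 0.1, "B-DIS": 0.2, "I-DIS": 2.0},  # argmax alone would pick I-DIS here
    {"O": 0.3, "B-DIS": 0.1, "I-DIS": 1.5},
    {"O": 1.0, "B-DIS": 0.2, "I-DIS": 0.1},
]
print(viterbi(emissions))  # ['B-DIS', 'I-DIS', 'O']
```

A per-character argmax over these scores would yield `I-DIS, I-DIS, O`, which starts inside an entity; the constrained decoder returns a well-formed span instead.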
Pages: 14