ADPG: Biomedical entity recognition based on Automatic Dependency Parsing Graph

Cited by: 1
Authors
Yang, Yumeng [1]
Lin, Hongfei [1]
Yang, Zhihao [1]
Zhang, Yijia [2]
Zhao, Di [3]
Huai, Shuaiheng [2]
Affiliations
[1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian, Peoples R China
[2] Dalian Maritime Univ, Sch Informat Sci & Technol, Dalian, Peoples R China
[3] Dalian Minzu Univ, Sch Comp Sci & Engn, Dalian, Peoples R China
Funding
China Postdoctoral Science Foundation
Keywords
NER; Tree-transformer; Dependency parsing; Biomedical
DOI
10.1016/j.jbi.2023.104317
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Named entity recognition (NER) is a key task in text mining. In the biomedical field, it focuses on extracting key information from large-scale biomedical texts for downstream information extraction tasks. Biomedical literature contains many sentences with long-range dependencies, and previous studies have used external syntactic parsing tools to capture word dependencies in sentences in order to recognize nested biomedical entities. However, external parsing tools often introduce unnecessary noise and cannot improve entity recognition in an end-to-end way. We therefore propose a novel automatic dependency parsing approach, the ADPG model, which fuses syntactic structure information in an end-to-end way to recognize biomedical entities. Specifically, the method uses a multilayer Tree-Transformer to automatically extract the semantic representation and syntactic structure of sentences with long-range dependencies, and then applies a multilayer graph attention network (GAT) to extract the dependency paths between words in that syntactic structure, improving biomedical entity recognition. We evaluated the ADPG model on three biomedical-domain datasets and one news-domain dataset; the experimental results show that it achieves state-of-the-art results on all four, which indicates a degree of generalization. The model is released on GitHub: https://github.com/Yumeng-Y/ADPG.
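The graph attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the GitHub repository for that). It assumes a single attention head, omits the learned projection matrix of a standard GAT layer, and uses a hand-written adjacency matrix in place of the structure that the Tree-Transformer would induce. The names `graph_attention`, `token_features`, and `token_links` are hypothetical and chosen only for this illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def leaky_relu(x, slope=0.2):
    """LeakyReLU, the nonlinearity used for GAT attention scores."""
    return x if x > 0 else slope * x


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def graph_attention(features, adjacency, attn_src, attn_dst):
    """Single-head GAT-style aggregation (assumption: no projection matrix).

    Each token attends only to itself and to tokens it is linked to in
    `adjacency`, so information flows along the syntactic structure.
    """
    updated = []
    for i, h_i in enumerate(features):
        # Neighbourhood = linked tokens plus a self-loop.
        neighbours = [j for j in range(len(features)) if j == i or adjacency[i][j]]
        # Score for edge (i, j): LeakyReLU(a_src . h_i + a_dst . h_j).
        scores = [
            leaky_relu(dot(attn_src, h_i) + dot(attn_dst, features[j]))
            for j in neighbours
        ]
        weights = softmax(scores)
        # New representation: attention-weighted sum of neighbour features.
        updated.append([
            sum(w * features[j][d] for w, j in zip(weights, neighbours))
            for d in range(len(h_i))
        ])
    return updated


# Toy input: three tokens with 2-dimensional features (made-up values).
token_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical undirected links: token 0 - token 1 - token 2.
token_links = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
contextual = graph_attention(token_features, token_links, [0.5, -0.5], [0.3, 0.1])
```

Stacking several such layers lets a token receive information from tokens more than one link away, which is the sense in which a multilayer GAT can follow dependency paths.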
Pages: 11