Semantic-enhanced graph neural network for named entity recognition in ancient Chinese books

Times cited: 0
Authors
Xu, Yongrui [1]
Mao, Caixia [2]
Wang, Zhiyong [1]
Jin, Guonian [1]
Zhong, Liangji [1]
Qian, Tao [1]
Affiliations
[1] Hubei Univ Sci & Technol, Sch Comp Sci & Technol, Xianning 437100, Peoples R China
[2] Hubei Univ Sci & Technol, Sch Elect & Informat Engn, Xianning 437100, Peoples R China
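Source
SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 01
Funding
National Natural Science Foundation of China; Science Foundation of the Ministry of Education of China;
Keywords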
Named entity recognition; Graph neural network; Ancient Chinese; Graph attention mechanism;
DOI
10.1038/s41598-024-68561-x
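Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code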
07 ; 0710 ; 09 ;
Abstract
Named entity recognition (NER) plays a crucial role in extracting and using knowledge from ancient Chinese books. However, ancient Chinese NER is challenging: linguistic features such as the use of single characters and short sentences are compounded by the scarcity of training data. Together, these factors limit the ability of deep learning models such as BERT-CRF to capture the semantic representation of ancient Chinese characters. In this paper, we explore the semantic enhancement of NER in ancient Chinese books through external knowledge. We propose a novel model based on graph neural networks that integrates two forms of external knowledge: dictionary-level and chapter-level information. Through the graph attention mechanism (GAT), this external knowledge is effectively incorporated into the model's input context. Our model is evaluated on the C_CLUE dataset, showing an improvement of 3.82% over the baseline BAC-CRF model. It also achieves the best score among several state-of-the-art dictionary-augmented models.
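The graph attention step mentioned in the abstract can be illustrated with a minimal single-head sketch. This is not the authors' implementation: it is the generic graph attention update (project each node, score each neighbor, softmax the scores, take the weighted sum), and the toy graph, the node layout (character nodes plus one dictionary-word node), and the names `gat_layer`, `W`, and `a` are all hypothetical choices made for illustration.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    # Multiply matrix W (list of rows) by vector h.
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbors, W, a):
    """Generic single-head graph attention update (illustrative only).

    features:  list of node feature vectors
    neighbors: dict mapping node index -> list of neighbor indices
               (include the node itself to get a self-loop)
    W:         projection matrix, shape out_dim x in_dim
    a:         attention vector of length 2 * out_dim
    """
    projected = [matvec(W, h) for h in features]
    out_dim = len(W)
    updated = []
    for i, z_i in enumerate(projected):
        nbrs = neighbors.get(i, [i])
        # Unnormalized score for each neighbor: LeakyReLU(a . [z_i || z_j]).
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, z_i + projected[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # New representation: attention-weighted sum of projected neighbors.
        updated.append([
            sum(al * projected[j][d] for al, j in zip(alphas, nbrs))
            for d in range(out_dim)
        ])
    return updated


# Hypothetical toy graph: nodes 0-2 are characters of a sentence, node 3 is a
# dictionary word covering characters 0 and 1 (an assumed way to attach
# dictionary-level knowledge; the paper may build its graph differently).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [3.0, 3.0]]
neighbors = {0: [0, 3], 1: [1, 3], 2: [2], 3: [3]}
W = [[1.0, 0.0], [0.0, 1.0]]       # identity projection, for readability
a = [0.1, 0.1, 0.2, 0.2]           # arbitrary attention parameters
enhanced = gat_layer(features, neighbors, W, a)
```

In this sketch, characters 0 and 1 absorb information from the word node, while character 2, which no dictionary entry covers, keeps its own projected features. In a trained model, `W` and `a` would be learned rather than fixed by hand.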
Pages: 12