A Research Toward Chinese Named Entity Recognition Based on Transfer Learning

Cited by: 7
Authors
Kang, Hui [1 ,2 ]
Xiao, Jingwu [3 ]
Zhang, Yunpeng [3 ]
Zhang, Lei [3 ]
Zhao, Xu [3 ]
Feng, Tie [1 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[3] Jilin Univ, Coll Software, Changchun 130012, Peoples R China
Keywords
Named entity recognition; Transfer learning; LSTM; CRF
DOI
10.1007/s44196-023-00244-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To improve named entity recognition when well-annotated entity data is scarce, this paper proposes a transfer learning-based Chinese named entity recognition model. The work comprises three parts. (1) First, a data transfer method based on entity features is proposed. The similarity between the feature distributions of the low-resource and high-resource data is calculated, the most representative entity features are selected for feature transfer mapping, and the distance between the entity distributions of the two domains is calculated to bridge the gap between their data; the model is then trained on the high-resource data. (2) Second, an entity boundary detection method is proposed. It uses BiLSTM+CRF as the main structure and integrates character boundary information to assist the attention network, improving the model's ability to recognize entity boundaries. (3) Finally, multiple named entity recognition methods are selected as baselines for comparison, and experiments are conducted on several datasets. The results show that, in low-resource domains, the proposed model improves the accuracy of named entity recognition by 1%, the recall by 2%, and the F1 value by 2% on average.
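The distribution comparison in part (1) of the abstract can be illustrated with a small sketch. The abstract does not state which features or which similarity measure the authors use, so everything here is an assumption made for illustration: character-level frequency features, the Jensen-Shannon divergence as the distance, the hypothetical function names `char_distribution` and `js_divergence`, and the made-up toy entity lists. It is not the paper's method.

```python
import math
from collections import Counter


def char_distribution(entities):
    """Relative frequency of each character across a list of entity strings."""
    counts = Counter(ch for entity in entities for ch in entity)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions.

    With base-2 logarithms the value lies in [0, 1]:
    0 for identical distributions, 1 for distributions with disjoint support.
    """
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the union of both supports.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); terms with zero probability contribute nothing.
        return sum(pa * math.log2(pa / m[k]) for k, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Toy, made-up entity lists standing in for the two domains (assumption).
high_resource_entities = ["北京大学", "吉林大学", "北京市"]
low_resource_entities = ["吉林大学", "长春市"]

p = char_distribution(high_resource_entities)
q = char_distribution(low_resource_entities)
distance = js_divergence(p, q)
print(f"JS divergence between domains: {distance:.3f}")
```

A smaller value suggests the entity character distributions of the two domains are closer. In a setup like this, such a score could be used to rank candidate features or source data before transfer, but how the paper actually uses its distance is not specified in the abstract.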
Pages: 15