TFM: A Triple Fusion Module for Integrating Lexicon Information in Chinese Named Entity Recognition

Cited by: 12
Authors
Liu, Haitao [1]
Song, Jihua [1]
Peng, Weiming [1]
Sun, Jingbo [1]
Xin, Xianwei [1]
Affiliations
[1] Beijing Normal University, School of Artificial Intelligence, Beijing 100875, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Chinese named entity recognition; Lexicon information; Information fusion; Natural language processing
DOI
10.1007/s11063-022-10768-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Because of the characteristics of the Chinese writing system, character-based Chinese named entity recognition models ignore the word information in sentences, which harms their performance. Many recent works try to alleviate this problem by integrating lexicon information into character-based models. These models, however, either simply concatenate word embeddings or rely on complex structures that lead to low efficiency. Furthermore, they treat word information as the only resource a lexicon provides, so the value of the lexicon is not fully explored. In this work, we observe another neglected source of information, namely the position of a character within a word, which helps identify character meanings. To fuse character, word and character-position information, we modify the key-value memory network and propose a triple fusion module, termed TFM. TFM is neither limited to simple concatenation nor burdened by complicated computation, and it is compatible with general sequence labeling models. Experimental evaluations show superior performance for our model, with F1-scores on Resume, Weibo and MSRA of 96.19%, 71.12% and 95.63%, respectively.
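The abstract names three signals (character, word, and character position in a word) and a key-value memory network as the fusion mechanism. The sketch below is only an illustration of those ideas, not the authors' TFM: it assumes a B/M/E/S position tagging scheme, a generic dot-product key-value memory read, and hypothetical helper names (`match_lexicon`, `kv_memory_read`) and a toy lexicon, none of which come from the paper.

```python
import math


def match_lexicon(sentence, lexicon, max_len=4):
    """For each character index, collect (word, position_tag) pairs for every
    lexicon word that covers that character.

    Tags are an assumed scheme: B (begin), M (middle), E (end), S (single).
    """
    matches = [[] for _ in sentence]
    for start in range(len(sentence)):
        last_end = min(start + max_len, len(sentence))
        for end in range(start + 1, last_end + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            for i in range(start, end):
                if len(word) == 1:
                    tag = "S"
                elif i == start:
                    tag = "B"
                elif i == end - 1:
                    tag = "E"
                else:
                    tag = "M"
                matches[i].append((word, tag))
    return matches


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kv_memory_read(query, keys, values):
    """Generic key-value memory read (an assumption, not the paper's variant):
    score each key against the query by dot product, then return the
    softmax-weighted sum of the values."""
    if not keys:
        return [0.0] * len(query)
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


# Toy lexicon, chosen only for illustration.
toy_lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
toy_matches = match_lexicon("南京市长江大桥", toy_lexicon)
print(toy_matches[2])  # lexicon words covering the character "市"
```

In this toy example the character "市" is covered by "南京市" (as its end) and by "市长" (as its beginning), which is the kind of position signal the abstract describes. In a full model, the query would be a character representation, the keys word embeddings, and the values position-tag embeddings; how TFM actually arranges these is not stated in the abstract.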
Pages: 3425-3442
Number of pages: 18