Dynamically Transfer Entity Span Information for Cross-domain Chinese Named Entity Recognition

Cited by: 0
Authors
Wu B.-C. [1 ,3 ]
Deng C.-L. [1 ,3 ]
Guan B. [1 ]
Chen X.-L. [1 ,3 ]
Zan D.-G. [1 ,3 ]
Chang Z.-J. [4 ]
Xiao Z.-Y. [5 ]
Qu D.-C. [5 ]
Wang Y.-J. [1 ,2 ,3 ]
Affiliations
[1] Collaborative Innovation Center, Institute of Software, Chinese Academy of Sciences, Beijing
[2] State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing
[3] University of Chinese Academy of Sciences, Beijing
[4] National Science Library, Chinese Academy of Sciences, Beijing
[5] School of Computer Science and Technology, Beijing Institute of Technology, Beijing
Source
Ruan Jian Xue Bao/Journal of Software | 2022 / Vol. 33 / No. 10
Keywords
bidirectional long short-term memory (BiLSTM) neural network; cross-domain; dynamic fusion; named entity recognition (NER); transfer learning
DOI
10.13328/j.cnki.jos.006305
Abstract
Boundary identification of Chinese named entities is a difficult problem because Chinese text contains no separators between words. Furthermore, the scarcity of well-annotated data makes Chinese named entity recognition (NER) even more challenging in vertical domains such as the clinical and financial domains. To address these issues, this study proposes a novel cross-domain Chinese NER model that dynamically transfers entity span information (TES-NER). Cross-domain shared entity span information, which represents the scope of Chinese named entities, is transferred from a general domain (source domain) with a sufficient corpus to a Chinese NER model on a vertical domain (target domain) through a dynamic fusion layer based on a gate mechanism. Specifically, TES-NER first introduces a cross-domain shared entity span recognition module, based on a bidirectional long short-term memory (BiLSTM) layer and a fully connected neural network (FCN), which identifies the cross-domain shared entity span information used to determine the boundaries of Chinese named entities. Then, a Chinese NER module is constructed to identify domain-specific Chinese named entities by applying an independent BiLSTM with a conditional random field model (BiLSTM-CRF). Finally, a dynamic fusion layer is designed to dynamically determine, through the gate mechanism, how much of the cross-domain shared entity span information extracted by the entity span recognition module is transferred to the domain-specific NER model. The general domain (source domain) dataset is a news-domain dataset (MSRA) with a sufficient labeled corpus, while the vertical domain (target domain) datasets comprise three datasets: a mixed domain (OntoNotes 5.0), a financial domain (Resume), and a medical domain (CCKS 2017).
Among them, the mixed-domain dataset (OntoNotes 5.0) is a corpus integrating six different vertical domains. On these three datasets, the F1 scores of the proposed model are 2.18%, 1.68%, and 0.99% higher than those of BiLSTM-CRF, respectively. © 2022 Chinese Academy of Sciences. All rights reserved.
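The gate-based dynamic fusion the abstract describes can be sketched as an elementwise convex combination of the two feature vectors, with a learned sigmoid gate deciding how much span knowledge to transfer. This is a minimal illustrative sketch, not the paper's actual implementation: the names `dynamic_fusion`, `W_g`, and `b_g`, the feature dimensions, and the exact gate parameterization are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamic_fusion(h_ner, h_span, W_g, b_g):
    """Gate-controlled fusion (hypothetical sketch) of a domain-specific
    NER feature vector (h_ner) with the cross-domain shared entity span
    feature vector (h_span). The gate g, per dimension and in (0, 1),
    controls how much span information flows into the fused representation.
    """
    g = sigmoid(W_g @ np.concatenate([h_ner, h_span]) + b_g)
    return g * h_ner + (1.0 - g) * h_span

# Toy example with 4-dimensional features.
rng = np.random.default_rng(0)
h_ner = rng.standard_normal(4)
h_span = rng.standard_normal(4)
W_g = rng.standard_normal((4, 8))  # maps [h_ner; h_span] to gate logits
b_g = np.zeros(4)
fused = dynamic_fusion(h_ner, h_span, W_g, b_g)
```

Because the gate values lie strictly in (0, 1), each fused component lies between the corresponding components of `h_ner` and `h_span`, so the layer can smoothly interpolate between relying on domain-specific features and borrowing the shared span knowledge.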
Pages: 3776-3792
Page count: 16