An Unsupervised Multiple-Task and Multiple-Teacher Model for Cross-lingual Named Entity Recognition

被引:0
作者
Li, Zhuoran [1 ]
Hu, Chunming [1 ]
Guo, Xiaohui [2 ]
Chen, Junfan [1 ]
Qin, Wenyi [1 ]
Zhang, Richong [1 ]
机构
[1] Beihang Univ, Sch Comp Sci & Engn, SKLSDE, Beijing, Peoples R China
[2] Beihang Univ, Hangzhou Innovat Inst, Hangzhou, Peoples R China
来源
PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS) | 2022年
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Cross-lingual named entity recognition task is one of the critical problems for evaluating the potential transfer learning techniques on low resource languages. Knowledge distillation using pre-trained multilingual language models between source and target languages have shown their superiority in transfer. However, existing cross-lingual distillation models merely consider the potential transferability between two identical single tasks across both domains. Other possible auxiliary tasks to improve the learning performance have not been fully investigated. In this study, based on the knowledge distillation framework and multi-task learning, we introduce the similarity metric model as an auxiliary task to improve the cross-lingual NER performance on the target domain. Specifically, an entity recognizer and a similarity evaluator are first trained in parallel as two teachers from the source domain. Then, two tasks in the student model are supervised by these teachers simultaneously. Empirical studies on the three datasets across 7 different languages confirm the effectiveness of the proposed model.
引用
收藏
页码:170 / 179
页数:10
相关论文
共 24 条
[1]  
Bromley J., 1993, International Journal of Pattern Recognition and Artificial Intelligence, V7, P669, DOI 10.1142/S0218001493000339
[2]  
Chen WL, 2021, 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), P743
[3]  
Chiu Jason P. C., 2016, Named entity recognition with bidirectional lstm-cnns
[4]  
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[5]  
Jain A, 2019, 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019), P1083
[6]  
Keung P, 2019, 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019), P1355
[7]  
Kingma DP, 2014, ADV NEUR IN, V27
[8]  
Koch G, 2015, ICML DEEP LEARNING W, P2
[9]  
Lample G., 2016, P C N AM CHAPT ASS C, P260, DOI [10.18653/v1/N16-1030, DOI 10.18653/V1/N16-1030]
[10]  
Liang Shining, P 27 ACM SIGKDD C KN, P3231