Large Language Models for Latvian Named Entity Recognition

Cited by: 5
Authors
Viksna, Rinalds [1, 2]
Skadina, Inguna [1, 2]
Affiliations
[1] Tilde, Vienibas Gatve 75a, LV-1004 Riga, Latvia
[2] Univ Latvia, Fac Comp, Riga, Latvia
Source
HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE (HLT 2020) | 2020 / Vol. 328
Keywords
Named entity recognition; NER; Latvian language; BERT
DOI
10.3233/FAIA200603
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformer-based language models pre-trained on large corpora have demonstrated good results on multiple natural language processing tasks, including named entity recognition (NER), for widely used languages. In this paper, we investigate the role of BERT models in the NER task for Latvian. We introduce a BERT model pre-trained on Latvian language data. We demonstrate that the Latvian BERT model, pre-trained on large Latvian corpora, achieves better results than multilingual BERT (M-BERT): an average F1-measure of 81.91 vs 78.37 on a dataset with nine named entity types, and 79.72 vs 78.83 on another dataset with seven types. It also outperforms previously developed Latvian NER systems.
Pages: 62-69
Page count: 8