Entity Linking over Nested Named Entities for Russian

Citations: 0
Authors
Loukachevitch, Natalia [1]
Braslavski, Pavel [2, 3]
Ivanov, Vladimir [4]
Batura, Tatiana [5, 6]
Manandhar, Suresh [9]
Shelmanov, Artem [1, 7]
Tutubalina, Elena [3, 8]
Affiliations
[1] Lomonosov Moscow State Univ, Moscow, Russia
[2] Ural Fed Univ, Ekaterinburg, Russia
[3] HSE Univ, Moscow, Russia
[4] Innopolis Univ, Innopolis, Russia
[5] Novosibirsk State Univ, Novosibirsk, Russia
[6] Ershov Inst Informat Syst, Novosibirsk, Russia
[7] Artificial Intelligence Res Inst, Moscow, Russia
[8] Sber AI, Moscow, Russia
[9] Madan Bhandari Univ Sci & Technol Dev Board, Kathmandu, Nepal
Source
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2022
Funding
Russian Science Foundation;
Keywords
information extraction; entity linking; nested named entities; Russian language; DISAMBIGUATION;
DOI
Not available
Chinese Library Classification
TP39 [Computer applications];
Subject classification code
081203; 0835;
Abstract
In this paper, we describe entity linking annotation over nested named entities in the recently released Russian NEREL dataset for information extraction. The NEREL collection (Loukachevitch et al., 2021) is currently the largest Russian dataset annotated with entities and relations. The paper describes the main design principles behind NEREL's entity linking annotation, provides its statistics, and reports evaluation results for several entity linking baselines. To date, 38,152 entity mentions in 933 documents are linked to Wikidata. The NEREL dataset is publicly available: https://github.com/nerel-ds/NEREL.
Pages: 4458-4466
Number of pages: 9