Cross-Lingual Cross-Domain Nested Named Entity Evaluation on English Web Texts

Cited by: 0
Authors
Plank, Barbara [1]
Affiliations
[1] IT Univ Copenhagen, Dept Comp Sci, Rued Langgaards Vej 7, DK-2300 Copenhagen S, Denmark
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021 | 2021
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Named Entity Recognition (NER) is a key Natural Language Processing task. However, most existing work on NER targets flat named entities (NEs) and ignores the recognition of nested structures, where entities can be enclosed within other NEs. Moreover, evaluation of Nested Named Entity Recognition (NNER) across domains remains challenging, mainly due to the limited availability of datasets. To address these gaps, we present EWT-NNER, a dataset covering five web domains annotated for nested named entities on top of the English Web Treebank (EWT). We present the corpus and an empirical evaluation, including transfer results from German and Danish. EWT-NNER is annotated for four major entity types, including suffixes for derivational entity markers and partial named entities, spanning a total of 12 classes. We envision the public release of EWT-NNER to encourage further research on nested NER, particularly on cross-lingual cross-domain evaluation.
Pages: 1808-1815
Number of pages: 8