KIND: an Italian Multi-Domain Dataset for Named-Entity Recognition

Cited by: 0
Authors
Paccosi, Teresa [1, 2]
Aprosio, Alessio Palmero [1]
Affiliations
[1] Fdn Bruno Kessler, Via Sommarive 18, Trento, Italy
[2] Univ Trento, Corso Bettini 84, Rovereto, Italy
Source
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2022
Keywords
Named-entity recognition; Italian language; Natural Language Processing;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In this paper we present KIND, an Italian dataset for named-entity recognition (NER). It contains more than one million tokens annotated with three classes: person, location, and organization. Most of the dataset (around 600K tokens) carries manual gold annotations across three domains (news, literature, and political discourse); the remainder is semi-automatically annotated. The multi-domain coverage is the main strength of this work: it offers a resource spanning different styles and language uses, and it is the largest Italian NER dataset with manual gold annotations. It is an important resource for training NER systems for Italian. Texts and annotations are freely downloadable from the GitHub repository.
Pages: 501-507
Number of pages: 7