Constrained Labeled Data Generation for Low-Resource Named Entity Recognition

Authors
Guo, Ruohao [1]
Roth, Dan [2]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Univ Penn, Philadelphia, PA USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Named Entity Recognition (NER) in low-resource languages has been a long-standing challenge in NLP. Recent work has shown great progress in two directions: developing cross-lingual features/models to transfer knowledge to low-resource languages, and translating source-language training data into low-resource target-language training data by projecting annotations with cheap resources. We focus on the second direction in this study. Existing methods suffer from the low quality of the resulting annotated data in the target language; for example, they cannot handle word order and lexical ambiguity well. To address these limitations, we propose a novel approach that uses the projected annotation to generate pseudo-supervised data with a transformer language model and a constrained beam search. This allows us to generate more diverse, higher-quality, and larger quantities of annotated data in the target language. Experiments demonstrate that, when combined with available cross-lingual features, our method achieves state-of-the-art or competitive performance on NER in a low-resource setting, especially for languages that are distant from our source language, English.
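The abstract names constrained beam search as the decoding step but does not describe it. As a rough illustration only, and not the authors' implementation, the sketch below shows one common simplified variant: hypotheses are grouped into banks by how many required tokens they already contain, so partially constrained candidates are not pruned by higher-scoring unconstrained ones. The toy bigram table `BIGRAMS`, the scorer `toy_score`, the vocabulary, and the function name `constrained_beam_search` are all hypothetical stand-ins; a real system would score continuations with a transformer language model.

```python
import math
from typing import Callable, Dict, FrozenSet, List, Optional, Sequence, Tuple

# A hypothesis is (log-score, tokens so far, required tokens already produced).
Hyp = Tuple[float, Tuple[str, ...], FrozenSet[str]]


def constrained_beam_search(
    score_next: Callable[[Tuple[str, ...], str], float],
    vocab: Sequence[str],
    constraints: Sequence[str],
    length: int,
    beam_size: int = 2,
) -> Optional[List[str]]:
    """Return the best length-`length` sequence containing every constraint token.

    Simplified sketch: candidates are kept in separate banks keyed by the
    number of constraints met, with `beam_size` survivors per bank.
    """
    required = frozenset(constraints)
    beams: List[Hyp] = [(0.0, (), frozenset())]
    for _ in range(length):
        banks: Dict[int, List[Hyp]] = {}
        for score, toks, met in beams:
            for w in vocab:
                new_met = frozenset(met | ({w} & required))
                cand = (score + score_next(toks, w), toks + (w,), new_met)
                banks.setdefault(len(new_met), []).append(cand)
        beams = []
        for bank in banks.values():
            bank.sort(key=lambda h: h[0], reverse=True)
            beams.extend(bank[:beam_size])
    # Only hypotheses that satisfy every constraint are valid outputs.
    finished = [h for h in beams if h[2] == required]
    if not finished:
        return None
    return list(max(finished, key=lambda h: h[0])[1])


# Hypothetical toy bigram "language model" (probabilities are made up).
BIGRAMS = {
    ("<s>", "the"): 0.6, ("the", "river"): 0.5, ("the", "Nile"): 0.1,
    ("river", "flows"): 0.6, ("Nile", "flows"): 0.6, ("flows", "north"): 0.5,
}
VOCAB = ["the", "river", "Nile", "flows", "north"]


def toy_score(prefix: Tuple[str, ...], word: str) -> float:
    """Log-probability of `word` given the previous token (0.01 if unseen)."""
    prev = prefix[-1] if prefix else "<s>"
    return math.log(BIGRAMS.get((prev, word), 0.01))


if __name__ == "__main__":
    # Force the entity token "Nile" to appear in a 4-token sentence.
    print(constrained_beam_search(toy_score, VOCAB, ["Nile"], length=4))
```

In an annotation-projection setting, the constraint tokens would plausibly be the translated entity mentions, so that every generated sentence carries a known entity and its label; that mapping is an assumption here and is not spelled out in the abstract.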
Pages: 4519-4533
Page count: 15
Related papers
50 in total
  • [1] AUC Maximization for Low-Resource Named Entity Recognition
    Nguyen, Ngoc Dang
    Tan, Wei
    Du, Lan
    Buntine, Wray
    Beare, Richard
    Chen, Changyou
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 11, 2023, : 13389 - 13399
  • [2] Improving Low-resource Named Entity Recognition with Graph Propagated Data Augmentation
    Cai, Jiong
    Huang, Shen
    Jiang, Yong
    Tan, Zeqi
    Xie, Pengjun
    Tu, Kewei
    61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 110 - 118
  • [3] Exogenous and Endogenous Data Augmentation for Low-Resource Complex Named Entity Recognition
    Zhang, Xinghua
    Chen, Gaode
    Cui, Shiyao
    Sheng, Jiawei
    Liu, Tingwen
    Xu, Hongbo
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 630 - 640
  • [4] Biomedical Named Entity Recognition Under Low-Resource Situation
    Zhao, Jianfei
    Ren, Xiangyu
    Zhao, Shuo
    Li, Jinyi
    HEALTH INFORMATION PROCESSING. EVALUATION TRACK PAPERS, 2023, 1773 : 41 - 47
  • [5] Dual Adversarial Neural Transfer for Low-Resource Named Entity Recognition
    Zhou, Joey Tianyi
    Zhang, Hao
    Jin, Di
    Zhu, Hongyuan
    Fang, Meng
    Goh, Rick Siow Mong
    Kwok, Kenneth
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3461 - 3471
  • [6] Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
    School of Computer Science and Technology, University of Science and Technology of China, Hefei
    230027, China
    Unknown
    639798, Singapore
    Int. J. Crowd. Sci., 2024, 3 (140-148):
  • [7] Knowledge-Enriched Prompt for Low-Resource Named Entity Recognition
    Hou, Wenlong
    Zhao, Weidong
    Liu, Xianhui
    Guo, Wenyan
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (05)
  • [8] LELNER: A Lightweight and Effective Low-resource Named Entity Recognition model
    Zhang, Zhanjun
    Zhang, Haoyu
    Wan, Qian
    Liu, Jie
    KNOWLEDGE-BASED SYSTEMS, 2022, 251
  • [9] A Word Representation to Improve Named Entity Recognition in Low-resource Languages
    Mbouopda, Michael Franklin
    Yonta, Paulin Melatagia
    2019 SIXTH INTERNATIONAL CONFERENCE ON SOCIAL NETWORKS ANALYSIS, MANAGEMENT AND SECURITY (SNAMS), 2019, : 333 - 337
  • [10] A Robust and Domain-Adaptive Approach for Low-Resource Named Entity Recognition
    Yu, Houjin
    Mao, Xian-Ling
    Chi, Zewen
    Wei, Wei
    Huang, Heyan
    11TH IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH (ICKG 2020), 2020, : 297 - 304