Constrained Labeled Data Generation for Low-Resource Named Entity Recognition

Authors
Guo, Ruohao [1]
Roth, Dan [2]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Univ Penn, Philadelphia, PA USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Named Entity Recognition (NER) in low-resource languages has been a long-standing challenge in NLP. Recent work has shown great progress in two directions: developing cross-lingual features/models to transfer knowledge to low-resource languages, and translating source-language training data into low-resource target-language training data by projecting annotations with cheap resources. We focus on the second direction in this study. Existing methods suffer from the low quality of the resulting annotated data in the target language; for example, they cannot handle word order and lexical ambiguity well. To address these limitations, we propose a novel approach that uses the projected annotation to generate pseudo-supervised data with a transformer language model and a constrained beam search. This allows us to generate more diverse, higher-quality, and larger quantities of annotated data in the target language. Experiments demonstrate that, when combined with available cross-lingual features, our method achieves state-of-the-art or competitive performance on NER in a low-resource setting, especially for languages that are distant from our source language, English.
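To make the idea of constrained beam search concrete, the following is a minimal sketch and not the authors' implementation. It assumes the constraint is that every projected entity token must appear in the generated sentence, and it replaces the transformer language model with a hypothetical hand-written bigram table (`TOY_BIGRAMS`). The names `constrained_beam_search` and `toy_lm` are illustrative only.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical bigram probabilities standing in for a transformer language model.
TOY_BIGRAMS: Dict[str, Dict[str, float]] = {
    "<s>": {"Obama": 0.6, "Paris": 0.2, "visited": 0.2},
    "Obama": {"visited": 0.7, "Paris": 0.2, "Obama": 0.1},
    "visited": {"Paris": 0.5, "Obama": 0.3, "visited": 0.2},
    "Paris": {"visited": 0.4, "Obama": 0.3, "Paris": 0.3},
}


def toy_lm(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Return next-token probabilities given the prefix (only the last token is used)."""
    last = prefix[-1] if prefix else "<s>"
    return TOY_BIGRAMS[last]


def constrained_beam_search(
    score_next: Callable[[Tuple[str, ...]], Dict[str, float]],
    required: List[str],
    length: int,
    beam_size: int = 3,
) -> Tuple[str, ...]:
    """Find a high-scoring sequence of `length` tokens containing every token in `required`.

    `required` is assumed to hold distinct tokens (e.g. projected entity words).
    """
    beams: List[Tuple[float, Tuple[str, ...]]] = [(0.0, ())]
    for step in range(length):
        # Expand every hypothesis by every possible next token.
        candidates = []
        for logp, toks in beams:
            for tok, prob in score_next(toks).items():
                candidates.append((logp + math.log(prob), toks + (tok,)))
        # Group candidates by how many constraints they already satisfy, so that
        # hypotheses making progress on the constraints are not pruned away by
        # higher-scoring hypotheses that ignore them.
        banks: Dict[int, List[Tuple[float, Tuple[str, ...]]]] = {}
        for cand in candidates:
            met = sum(1 for r in required if r in cand[1])
            banks.setdefault(met, []).append(cand)
        remaining = length - (step + 1)
        beams = []
        for met, bank in banks.items():
            if len(required) - met > remaining:
                continue  # too few positions left to place the missing tokens
            bank.sort(key=lambda c: c[0], reverse=True)
            beams.extend(bank[:beam_size])
    finished = [b for b in beams if all(r in b[1] for r in required)]
    if not finished:
        raise ValueError("no hypothesis satisfies all constraints")
    return max(finished, key=lambda b: b[0])[1]


result = constrained_beam_search(toy_lm, required=["Obama", "Paris"], length=3)
print(" ".join(result))
```

In this sketch the labels of the required tokens are known in advance, so each generated sentence could be paired with its entity annotations. How the constraints are actually defined and enforced in the paper is not stated in the abstract.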
Pages: 4519 - 4533
Page count: 15
Related papers
50 in total
  • [21] DualNER: A Trigger-Based Dual Learning Framework for Low-Resource Named Entity Recognition
    Zhong, Maosheng
    Liu, GanLin
    Xiong, Jian
    Zuo, Jiali
    IEEE INTELLIGENT SYSTEMS, 2022, 37 (04) : 79 - 87
  • [22] A Low-Resource Named Entity Recognition Method for Cultural Heritage Field Incorporating Knowledge Fusion
    Li C.
    Hou X.
    Qiao X.
    Beijing Daxue Xuebao (Ziran Kexue Ban)/Acta Scientiarum Naturalium Universitatis Pekinensis, 2024, 60 (01) : 13 - 22
  • [23] CoTea: Collaborative teaching for low-resource named entity recognition with a divide-and-conquer strategy
    Yang, Zhiwei
    Ma, Jing
    Yang, Kang
    Lin, Huiru
    Chen, Hechang
    Yang, Ruichao
    Chang, Yi
    INFORMATION PROCESSING & MANAGEMENT, 2024, 61 (03)
  • [24] MAST-NER: A Low-Resource Named Entity Recognition Method Based on Trigger Pool
    Xu, Juxiong
    Li, Minbo
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2022, PT III, 2022, 13370 : 65 - 76
  • [25] Semi-supervised Named Entity Recognition for Low-Resource Languages Using Dual PLMs
    Yohannes, Hailemariam Mehari
    Lynden, Steven
    Amagasa, Toshiyuki
    Matono, Akiyoshi
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PT I, NLDB 2024, 2024, 14762 : 166 - 180
  • [26] Embedding Transfer for Low-Resource Medical Named Entity Recognition: A Case Study on Patient Mobility
    Newman-Griffis, Denis
    Zirikly, Ayah
    SIGBIOMED WORKSHOP ON BIOMEDICAL NATURAL LANGUAGE PROCESSING (BIONLP 2018), 2018, : 1 - 11
  • [27] Named-Entity Recognition for a Low-resource Language using Pre-Trained Language Model
    Yohannes, Hailemariam Mehari
    Amagasa, Toshiyuki
    37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2022, : 837 - 844
  • [28] Integrating prompt techniques and multi-similarity matching for named entity recognition in low-resource settings
    Yang, Jun
    Yao, Liguo
    Zhang, Taihua
    Tsai, Chieh-Yuan
    Lu, Yao
    Shen, Mingming
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 144
  • [29] ECTTLNER: An Effective Cross-Task Transferring Learning Method for Low-Resource Named Entity Recognition
    Xu, Yiwu
    Chen, Yun
    NEURAL PROCESSING LETTERS, 2025, 57 (01)
  • [30] Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data
    Jiang, Haoming
    Zhang, Danqing
    Cao, Tianyu
    Yin, Bing
    Zhao, Tuo
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 1775 - 1789