Improving Low-resource Named Entity Recognition with Graph Propagated Data Augmentation

Cited by: 0
Authors
Cai, Jiong [1]
Huang, Shen [3]
Jiang, Yong [3]
Tan, Zeqi [2]
Xie, Pengjun [3]
Tu, Kewei [1]
Affiliations
[1] Univ Chinese Acad Sci, Sch Informat Sci & Technol, Shanghai Engn Res Ctr Intelligent Vis & Imagin, Shanghai Inst Microsyst & Informat Technol, Shangh, Beijing, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
[3] Alibaba Grp, DAMO Acad, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data augmentation is an effective way to improve model performance and robustness for low-resource named entity recognition (NER). However, synthetic data often suffer from poor diversity, which limits performance. In this paper, we propose a novel Graph Propagated Data Augmentation (GPDA) framework for NER, leveraging graph propagation to build relationships between labeled data and unlabeled natural texts. By projecting the annotations from the labeled texts onto the unlabeled texts, the unlabeled texts become partially labeled and are more diverse than synthetically annotated data. To improve propagation precision, a simple search engine built on Wikipedia is used to fetch texts related to the labeled data and to propagate the entity labels to them according to their anchor links. In addition, we construct and run experiments on a real-world low-resource dataset from the e-commerce domain, which will be made publicly available to facilitate low-resource NER research. Experimental results show that GPDA yields substantial improvements over previous data augmentation methods on multiple low-resource NER datasets.
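The propagation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the data shapes, the function names `collect_entities` and `propagate`, the exact-match rule between an anchor's linked title and a labeled entity, and the "?" marker for unlabeled tokens are all assumptions made for illustration. The retrieval step (the Wikipedia search engine) is left out, and the retrieved sentence is supplied by hand.

```python
from typing import Dict, List, Tuple

# Hypothetical data shapes (assumptions, not from the paper):
#   labeled sentence   = tokens plus BIO tags
#   retrieved sentence = tokens plus anchors, each anchor = (start, end, linked_title)
Anchor = Tuple[int, int, str]


def collect_entities(tokens: List[str], tags: List[str]) -> Dict[str, str]:
    """Map each labeled entity surface string (lowercased) to its entity type."""
    entities: Dict[str, str] = {}
    span: List[str] = []
    span_type = None
    # A trailing "O" sentinel flushes an entity that ends the sentence.
    for token, tag in zip(tokens + [""], tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == span_type
        if not continues:
            if span:
                entities[" ".join(span).lower()] = span_type
            span, span_type = [], None
        if tag.startswith("B-"):
            span, span_type = [token], tag[2:]
        elif continues:
            span.append(token)
    return entities


def propagate(entities: Dict[str, str], tokens: List[str],
              anchors: List[Anchor]) -> List[str]:
    """Partially label a retrieved sentence.

    Anchors whose linked title matches a known entity receive B-/I- tags;
    every other token stays "?" (unknown), so the result is a partial labeling.
    """
    tags = ["?"] * len(tokens)
    for start, end, title in anchors:
        entity_type = entities.get(title.lower())
        if entity_type is None:
            continue  # anchor does not link to any labeled entity
        tags[start] = f"B-{entity_type}"
        for i in range(start + 1, end):
            tags[i] = f"I-{entity_type}"
    return tags


# Toy example with made-up sentences.
known = collect_entities(["She", "moved", "to", "New", "York"],
                         ["O", "O", "O", "B-LOC", "I-LOC"])
partial = propagate(known,
                    ["New", "York", "hosts", "the", "United", "Nations"],
                    [(0, 2, "New York"), (4, 6, "United Nations")])
print(partial)
```

In this toy case only the "New York" anchor matches a labeled entity, so the "United Nations" tokens remain unknown rather than being tagged "O". Training on such data would need a model that tolerates partial annotation.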
Pages: 110-118
Page count: 9
Related papers
50 in total
  • [1] Exogenous and Endogenous Data Augmentation for Low-Resource Complex Named Entity Recognition
    Zhang, Xinghua
    Chen, Gaode
    Cui, Shiyao
    Sheng, Jiawei
    Liu, Tingwen
    Xu, Hongbo
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 630 - 640
  • [2] Constrained Labeled Data Generation for Low-Resource Named Entity Recognition
    Guo, Ruohao
    Roth, Dan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 4519 - 4533
  • [3] AUC Maximization for Low-Resource Named Entity Recognition
    Nguyen, Ngoc Dang
    Tan, Wei
    Du, Lan
    Buntine, Wray
    Beare, Richard
    Chen, Changyou
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 11, 2023, : 13389 - 13399
  • [4] Enhancement of Named Entity Recognition in Low-Resource Languages with Data Augmentation and BERT Models: A Case Study on Urdu
    Ullah, Fida
    Gelbukh, Alexander
    Zamir, Muhammad Tayyab
    Riveron, Edgardo Manuel Felipe
    Sidorov, Grigori
    COMPUTERS, 2024, 13 (10)
  • [5] Biomedical Named Entity Recognition Under Low-Resource Situation
    Zhao, Jianfei
    Ren, Xiangyu
    Zhao, Shuo
    Li, Jinyi
    HEALTH INFORMATION PROCESSING. EVALUATION TRACK PAPERS, 2023, 1773 : 41 - 47
  • [6] 3Rs: Data Augmentation Techniques Using Document Contexts For Low-Resource Chinese Named Entity Recognition
    Ying, Zheyu
    Zhang, Jinglei
    Xie, Rui
    Wen, Guochang
    Xiao, Feng
    Liu, Xueyang
    Zhang, Shikun
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [7] Improving Named Entity Recognition for Social Media with Data Augmentation
    Liu, Wenzhong
    Cui, Xiaohui
    APPLIED SCIENCES-BASEL, 2023, 13 (09):
  • [8] RoPDA: Robust Prompt-Based Data Augmentation for Low-Resource Named Entity Recognition
    Song, Sihan
    Shen, Furao
    Zhao, Jian
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 19017 - 19025
  • [9] Dual Adversarial Neural Transfer for Low-Resource Named Entity Recognition
    Zhou, Joey Tianyi
    Zhang, Hao
    Jin, Di
    Zhu, Hongyuan
    Fang, Meng
    Goh, Rick Siow Mong
    Kwok, Kenneth
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3461 - 3471
  • [10] Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
    School of Computer Science and Technology, University of Science and Technology of China, Hefei
    230027, China
    Unknown
    639798, Singapore
    Int. J. Crowd. Sci., 2024, 3 (140-148):