Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

Citations: 0
Authors
Meng, Yu [1]
Zhang, Yunyi [1]
Huang, Jiaxin [1]
Wang, Xuan [1]
Zhang, Yu [1]
Ji, Heng [1]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
Source
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021) | 2021
Funding
U.S. National Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base. The biggest challenge of distantly-supervised NER is that the distant supervision may induce incomplete and noisy labels, rendering the straightforward application of supervised learning ineffective. In this paper, we propose (1) a noise-robust learning scheme, comprising a new loss function and a noisy label removal step, for training NER models on distantly-labeled data, and (2) a self-training method that uses contextualized augmentations created by pre-trained language models to improve the generalization ability of the NER model. On three benchmark datasets, our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
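The distant-labeling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy knowledge base `TOY_KB`, the helper `distant_label`, the BIO tagging scheme, and the greedy longest-match strategy are all assumptions made for illustration. It shows how dictionary matching yields labels and why those labels can be incomplete when a mention is missing from the knowledge base.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base (an assumption for illustration):
# lowercased mention tokens -> entity type.
TOY_KB: Dict[Tuple[str, ...], str] = {
    ("new", "york"): "LOC",
    ("barack", "obama"): "PER",
    ("google",): "ORG",
}


def distant_label(tokens: List[str], kb: Dict[Tuple[str, ...], str]) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup of token spans in kb."""
    max_len = max((len(mention) for mention in kb), default=0)
    lowered = [t.lower() for t in tokens]
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for span in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(lowered[i:i + span])
            if key in kb:
                etype = kb[key]
                labels[i] = f"B-{etype}"
                for j in range(i + 1, i + span):
                    labels[j] = f"I-{etype}"
                i += span
                matched = True
                break
        if not matched:
            i += 1
    return labels


tokens = "Barack Obama visited Google offices in New York with Angela Merkel".split()
labels = distant_label(tokens, TOY_KB)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

In this toy run, "Angela Merkel" is absent from the knowledge base and is therefore tagged `O`, which is the kind of incomplete label the abstract identifies as the main difficulty of distant supervision.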
Pages: 10367-10378
Number of pages: 12
Related papers
50 in total
[11]   Reinforcement learning based distantly supervised biomedical named entity recognition [J].
Bali, Manish ;
Anandaraj, S. P. .
INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS, 2023, 17 (02) :317-330
[12]   Software Entity Recognition with Noise-Robust Learning [J].
Tai Nguyen ;
Di, Yifeng ;
Lee, Joohan ;
Chen, Muhao ;
Zhang, Tianyi .
2023 38TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE, 2023, :484-496
[13]   Distantly Supervised Named Entity Recognition with Self-Adaptive Label Correction [J].
Nie, Binling ;
Li, Chenyang .
APPLIED SCIENCES-BASEL, 2022, 12 (15)
[14]   Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning [J].
Peng, Minlong ;
Xing, Xiaoyu ;
Zhang, Qi ;
Fu, Jinlan ;
Huang, Xuanjing .
57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, :2409-2419
[15]   Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning [J].
Hu, Helan ;
Si, Shuzheng ;
Zhao, Haozhe ;
Zeng, Shuang ;
An, Kaikai ;
Cai, Zefan ;
Chang, Baobao .
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, :5533-5546
[16]   Denoising Distantly Supervised Named Entity Recognition via a Hypergeometric Probabilistic Model [J].
Zhang, Wenkai ;
Lin, Hongyu ;
Han, Xianpei ;
Sun, Le ;
Liu, Huidan ;
Wei, Zhicheng ;
Yuan, Nicholas Jing .
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 :14481-14488
[17]   A pre-training and self-training approach for biomedical named entity recognition [J].
Gao, Shang ;
Kotevska, Olivera ;
Sorokine, Alexandre ;
Christian, J. Blair .
PLOS ONE, 2021, 16 (02)
[18]   Self-training and co-training applied to Spanish Named Entity Recognition [J].
Kozareva, Z ;
Bonev, B ;
Montoyo, A .
MICAI 2005: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2005, 3789 :770-779
[19]   Learning with Noise: Improving Distantly-Supervised Fine-grained Entity Typing via Automatic Relabeling [J].
Zhang, Haoyu ;
Long, Dingkun ;
Xu, Guangwei ;
Zhu, Muhua ;
Xie, Pengjun ;
Huang, Fei ;
Wang, Ji .
PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, :3808-3815
[20]   A Self-training Approach for Few-Shot Named Entity Recognition [J].
Qian, Yudong ;
Zheng, Weiguo .
WEB AND BIG DATA, PT II, APWEB-WAIM 2022, 2023, 13422 :183-191