Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning

Cited by: 0
Authors
Hu, Helan [1 ,2 ]
Si, Shuzheng [1 ,2 ]
Zhao, Haozhe [1 ,2 ]
Zeng, Shuang [3 ]
An, Kaikai [1 ,2 ]
Cai, Zefan [1 ,2 ]
Chang, Baobao [1 ,4 ]
Affiliations
[1] Peking Univ, Natl Key Lab Multimedia Informat Proc, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Tencent Inc, Shenzhen, Peoples R China
[4] Jiangsu Normal Univ, Jiangsu Collaborat Innovat Ctr Language Abil, Xuzhou, Peoples R China
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024 | 2024
Funding
U.S. National Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates the burden of annotation, but it suffers from label noise. Recent works attempt to adopt the teacher-student framework to gradually refine the training labels and improve the overall robustness. However, we argue that these teacher-student methods achieve limited performance because the poor calibration of the teacher network produces incorrectly pseudo-labeled samples, leading to error propagation. Therefore, we attempt to mitigate this issue by proposing: (1) Uncertainty-Aware Teacher Learning, which leverages prediction uncertainty to reduce the number of incorrect pseudo labels in the self-training stage; (2) Student-Student Collaborative Learning, which allows the transfer of reliable labels between two student networks instead of indiscriminately relying on all pseudo labels from the teacher, and further enables a full exploration of mislabeled samples rather than simply filtering out unreliable pseudo-labeled samples. We evaluate our proposed method on five DS-NER datasets, demonstrating that our method is superior to state-of-the-art DS-NER denoising methods.
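The abstract's first component, filtering pseudo labels by the teacher's prediction uncertainty, can be sketched in a minimal form. This is an illustrative assumption, not the authors' exact formulation: here uncertainty is measured as the predictive entropy of the teacher's per-token softmax distribution, and the function `filter_pseudo_labels` and the threshold value are hypothetical names chosen for this sketch.

```python
# Hedged sketch of uncertainty-based pseudo-label filtering for DS-NER
# self-training. The entropy measure and threshold are illustrative
# assumptions, not the paper's exact method.
import numpy as np

def filter_pseudo_labels(probs, threshold=0.5):
    """Keep a token's pseudo label only when the teacher's predictive
    entropy is below `threshold` (i.e., the teacher is confident).

    probs: (num_tokens, num_labels) softmax outputs from the teacher.
    Returns (labels, keep_mask): argmax pseudo labels and a boolean
    mask marking tokens whose labels are treated as reliable.
    """
    # Predictive entropy per token; small epsilon avoids log(0).
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=-1)
    keep = entropy < threshold
    labels = probs.argmax(axis=-1)
    return labels, keep

# Toy example: one confident token, one uncertain token.
probs = np.array([[0.97, 0.02, 0.01],   # low entropy  -> label kept
                  [0.40, 0.35, 0.25]])  # high entropy -> label filtered
labels, keep = filter_pseudo_labels(probs, threshold=0.5)
```

In the paper's second component, a co-teaching-style exchange, each student would then pass only its reliable subset to the other student for training, rather than each student consuming all of its teacher's pseudo labels.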
Pages: 5533-5546
Page count: 14