Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning

Cited by: 0
Authors
Hu, Helan [1 ,2 ]
Si, Shuzheng [1 ,2 ]
Zhao, Haozhe [1 ,2 ]
Zeng, Shuang [3 ]
An, Kaikai [1 ,2 ]
Cai, Zefan [1 ,2 ]
Chang, Baobao [1 ,4 ]
Affiliations
[1] Peking Univ, Natl Key Lab Multimedia Informat Proc, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Tencent Inc, Shenzhen, Peoples R China
[4] Jiangsu Normal Univ, Jiangsu Collaborat Innovat Ctr Language Abil, Xuzhou, Peoples R China
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024 | 2024
Funding
U.S. National Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates the burden of annotation, but it suffers from label noise. Recent works attempt to adopt the teacher-student framework to gradually refine the training labels and improve the overall robustness. However, we argue that these teacher-student methods achieve limited performance because the poor calibration of the teacher network produces incorrectly pseudo-labeled samples, leading to error propagation. Therefore, we attempt to mitigate this issue by proposing: (1) Uncertainty-Aware Teacher Learning, which leverages prediction uncertainty to reduce the number of incorrect pseudo labels in the self-training stage; (2) Student-Student Collaborative Learning, which allows the transfer of reliable labels between two student networks instead of indiscriminately relying on all pseudo labels from the teacher, and further enables a full exploration of mislabeled samples rather than simply filtering out unreliable pseudo-labeled samples. We evaluate our proposed method on five DS-NER datasets, demonstrating that our method is superior to state-of-the-art DS-NER denoising methods.
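The abstract's first component, filtering pseudo labels by the teacher's prediction uncertainty, can be sketched in a minimal form. This is an illustrative assumption, not the authors' exact formulation: here uncertainty is measured as the predictive entropy of the teacher's per-token softmax distribution, and the function `filter_pseudo_labels` and the threshold value are hypothetical names chosen for this sketch.

```python
# Hedged sketch of uncertainty-based pseudo-label filtering for DS-NER
# self-training. The entropy measure and threshold are illustrative
# assumptions, not the paper's exact method.
import numpy as np

def filter_pseudo_labels(probs, threshold=0.5):
    """Keep a token's pseudo label only when the teacher's predictive
    entropy is below `threshold` (i.e., the teacher is confident).

    probs: (num_tokens, num_labels) softmax outputs from the teacher.
    Returns (labels, keep_mask): argmax pseudo labels and a boolean
    mask marking tokens whose labels are treated as reliable.
    """
    # Predictive entropy per token; small epsilon avoids log(0).
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=-1)
    keep = entropy < threshold
    labels = probs.argmax(axis=-1)
    return labels, keep

# Toy example: one confident token, one uncertain token.
probs = np.array([[0.97, 0.02, 0.01],   # low entropy  -> label kept
                  [0.40, 0.35, 0.25]])  # high entropy -> label filtered
labels, keep = filter_pseudo_labels(probs, threshold=0.5)
```

In the paper's second component, a co-teaching-style exchange, each student would then pass only its reliable subset to the other student for training, rather than each student consuming all of its teacher's pseudo labels.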
Pages: 5533-5546
Page count: 14