Towards Robust Learning with Noisy and Pseudo Labels for Text Classification

Cited by: 3
Authors
Murtadha, Ahmed [1]
Wen, Bo [1]
Ao, Luo [1 ]
Pan, Shengfeng [1 ]
Su, Jianlin [1 ]
Cao, Xinxin [2 ]
Liu, Yunfeng [1 ]
Affiliations
[1] Zhuiyi AI Lab, Shenzhen, Peoples R China
[2] Northwestern Polytech Univ, Xian, Shaanxi, Peoples R China
Keywords
Natural language processing; Negative learning; Learning with noisy labels; Semi-supervised text classification
DOI
10.1016/j.ins.2024.120160
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Unlike Positive Training (PT), Negative Training (NT) is an indirect learning technique that trains the model on a combination of clean and noisy data using complementary labels, which are drawn at random from the label space excluding the observed label. Although clean samples follow the same distribution as the test samples, the complementary labeling of NT treats them with the same level of uncertainty as noisy samples, so their contribution to the overall performance is relatively low. We propose a Learning with Noisy and Pseudo Labels (LNPL) framework, which jointly trains the model using PT and NT on clean and noisy data, respectively. The goal is to enable direct learning on clean samples while retaining the robustness of NT against noise in a unified framework. To mitigate the influence of abundant noisy instances, we place a gradient reversal layer on top of LNPL as a regularization term that misleads the recognition of each instance's source (i.e., clean or noisy). Moreover, we introduce a self-training variant of LNPL that casts semi-supervised text classification as a learning-with-noisy-pseudo-labels problem. Extensive experiments on various textual benchmark datasets demonstrate that LNPL is robust and consistently outperforms the alternatives. The code is available on GitHub.
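To make the two mechanisms in the abstract concrete, the PyTorch sketch below illustrates (a) negative training with a complementary label drawn uniformly from the label space excluding the observed label, and (b) a gradient reversal layer feeding a clean-vs-noisy source discriminator. This is a minimal illustration under assumed interfaces, not the authors' released code: the names negative_training_loss, GradReverse, joint_step, and the batch fields input_ids, labels, and is_clean are hypothetical.

```python
import torch
import torch.nn.functional as F

def negative_training_loss(logits, labels, eps=1e-7):
    # NT: sample a complementary label uniformly from all classes except the
    # observed (possibly noisy) label, then minimize the probability the
    # model assigns to it, i.e. the loss is -log(1 - p_complementary).
    num_classes = logits.size(1)
    offset = torch.randint(1, num_classes, labels.shape, device=labels.device)
    comp_labels = (labels + offset) % num_classes  # guaranteed != labels
    p_comp = F.softmax(logits, dim=1).gather(1, comp_labels.unsqueeze(1)).squeeze(1)
    return -torch.log(1.0 - p_comp + eps).mean()

class GradReverse(torch.autograd.Function):
    # Gradient reversal layer (Ganin & Lempitsky, 2015): identity on the
    # forward pass, negated and scaled gradient on the backward pass.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def joint_step(encoder, classifier, source_head, batch, lambd=1.0):
    # One hypothetical joint step: PT (cross-entropy) on the clean subset,
    # NT on the noisy subset, and an adversarial source-discrimination loss
    # routed through the reversal layer so the encoder is pushed toward
    # features that do not reveal whether an instance is clean or noisy.
    # Assumes each batch contains both clean and noisy instances.
    feats = encoder(batch["input_ids"])
    logits = classifier(feats)
    clean = batch["is_clean"]  # boolean mask over the batch
    pt_loss = F.cross_entropy(logits[clean], batch["labels"][clean])
    nt_loss = negative_training_loss(logits[~clean], batch["labels"][~clean])
    src_logits = source_head(GradReverse.apply(feats, lambd))
    adv_loss = F.cross_entropy(src_logits, clean.long())
    return pt_loss + nt_loss + adv_loss
```

In this reading, the reversal layer is what the abstract calls the regularization term: the source head learns to separate clean from noisy instances, while the reversed gradient drives the shared encoder to defeat it.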
Pages: 14
Related papers
50 items in total
  • [1] Is BERT Robust to Label Noise? A Study on Learning with Noisy Labels in Text Classification
    Zhu, Dawei
    Hedderich, Michael A.
    Zhai, Fangzhou
    Adelani, David Ifeoluwa
    Klakow, Dietrich
    PROCEEDINGS OF THE THIRD WORKSHOP ON INSIGHTS FROM NEGATIVE RESULTS IN NLP (INSIGHTS 2022), 2022: 62 - 67
  • [2] RoMo: Robust Unsupervised Multimodal Learning With Noisy Pseudo Labels
    Li, Yongxiang
    Qin, Yang
    Sun, Yuan
    Peng, Dezhong
    Peng, Xi
    Hu, Peng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 5086 - 5097
  • [3] Towards harnessing feature embedding for robust learning with noisy labels
    Zhang, Chuang
    Shen, Li
    Yang, Jian
    Gong, Chen
    MACHINE LEARNING, 2022, 111 (09) : 3181 - 3201
  • [4] Learning with Confident Examples: Rank Pruning for Robust Classification with Noisy Labels
    Northcutt, Curtis G.
    Wu, Tailin
    Chuang, Isaac L.
    CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI2017), 2017,
  • [5] Distributionally Robust Federated Learning for Network Traffic Classification With Noisy Labels
    Shi, Siping
    Guo, Yingya
    Wang, Dan
    Zhu, Yifei
    Han, Zhu
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (05) : 6212 - 6226
  • [6] Robust Federated Learning With Noisy Labels
    Yang, Seunghan
    Park, Hyoungseob
    Byun, Junyoung
    Kim, Changick
    IEEE INTELLIGENT SYSTEMS, 2022, 37 (02) : 35 - 43
  • [7] Robust Collaborative Learning with Noisy Labels
    Sun, Mengying
    Xing, Jing
    Chen, Bin
    Zhou, Jiayu
    20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2020), 2020: 1274 - 1279
  • [8] Learning to rectify for robust learning with noisy labels
    Sun, Haoliang
    Guo, Chenhui
    Wei, Qi
    Han, Zhongyi
    Yin, Yilong
    PATTERN RECOGNITION, 2022, 124
  • [9] Deep Learning Classification with Noisy Labels
    Sanchez, Guillaume
    Guis, Vincente
    Marxer, Ricard
    Bouchara, Frederic
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2020,