Towards Robust Learning with Noisy and Pseudo Labels for Text Classification

Cited by: 3
Authors
Murtadha, Ahmed [1]
Wen, Bo [1]
Ao, Luo [1 ]
Pan, Shengfeng [1 ]
Su, Jianlin [1 ]
Cao, Xinxin [2 ]
Liu, Yunfeng [1 ]
Affiliations
[1] Zhuiyi AI Lab, Shenzhen, Peoples R China
[2] Northwestern Polytech Univ, Xian, Shaanxi, Peoples R China
Keywords
Natural language processing; Negative learning; Learning with noisy labels; Semi-supervised text classification
DOI
10.1016/j.ins.2024.120160
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Unlike Positive Training (PT), Negative Training (NT) is an indirect learning technique that trains the model on a combination of clean and noisy data using complementary labels, which are drawn at random from the label space excluding the observed label. Although clean samples follow the same distribution as the test samples, the complementary labeling of NT treats them with the same level of uncertainty as noisy samples, so their contribution to the overall performance is relatively low. We propose a Learning with Noisy and Pseudo Labels (LNPL) framework, which jointly trains the model using PT and NT on clean and noisy data, respectively. The goal is to enable direct learning on clean samples while retaining the robustness of NT against noise in a unified framework. To mitigate the influence of abundant noisy instances, we place a gradient reversal layer on top of LNPL as a regularization term that misleads the recognition of each instance's source (i.e., clean or noisy). Moreover, we introduce a self-training variant of LNPL that casts semi-supervised text classification as a learning-with-noisy-pseudo-labels problem. Extensive experiments on various textual benchmark datasets demonstrate that LNPL is robust and consistently outperforms the alternatives. The code is available on GitHub.
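To make the two mechanisms in the abstract concrete, the PyTorch sketch below illustrates (a) negative training with a complementary label drawn uniformly from the label space excluding the observed label, and (b) a gradient reversal layer feeding a clean-vs-noisy source discriminator. This is a minimal illustration under assumed interfaces, not the authors' released code: the names negative_training_loss, GradReverse, joint_step, and the batch fields input_ids, labels, and is_clean are hypothetical.

```python
import torch
import torch.nn.functional as F

def negative_training_loss(logits, labels, eps=1e-7):
    # NT: sample a complementary label uniformly from all classes except the
    # observed (possibly noisy) label, then minimize the probability the
    # model assigns to it, i.e. the loss is -log(1 - p_complementary).
    num_classes = logits.size(1)
    offset = torch.randint(1, num_classes, labels.shape, device=labels.device)
    comp_labels = (labels + offset) % num_classes  # guaranteed != labels
    p_comp = F.softmax(logits, dim=1).gather(1, comp_labels.unsqueeze(1)).squeeze(1)
    return -torch.log(1.0 - p_comp + eps).mean()

class GradReverse(torch.autograd.Function):
    # Gradient reversal layer (Ganin & Lempitsky, 2015): identity on the
    # forward pass, negated and scaled gradient on the backward pass.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def joint_step(encoder, classifier, source_head, batch, lambd=1.0):
    # One hypothetical joint step: PT (cross-entropy) on the clean subset,
    # NT on the noisy subset, and an adversarial source-discrimination loss
    # routed through the reversal layer so the encoder is pushed toward
    # features that do not reveal whether an instance is clean or noisy.
    # Assumes each batch contains both clean and noisy instances.
    feats = encoder(batch["input_ids"])
    logits = classifier(feats)
    clean = batch["is_clean"]  # boolean mask over the batch
    pt_loss = F.cross_entropy(logits[clean], batch["labels"][clean])
    nt_loss = negative_training_loss(logits[~clean], batch["labels"][~clean])
    src_logits = source_head(GradReverse.apply(feats, lambd))
    adv_loss = F.cross_entropy(src_logits, clean.long())
    return pt_loss + nt_loss + adv_loss
```

In this reading, the reversal layer is what the abstract calls the regularization term: the source head learns to separate clean from noisy instances, while the reversed gradient drives the shared encoder to defeat it.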
Pages: 14
Related papers
50 items in total
  • [1] Is BERT Robust to Label Noise? A Study on Learning with Noisy Labels in Text Classification
    Zhu, Dawei
    Hedderich, Michael A.
    Zhai, Fangzhou
    Adelani, David Ifeoluwa
    Klakow, Dietrich
    PROCEEDINGS OF THE THIRD WORKSHOP ON INSIGHTS FROM NEGATIVE RESULTS IN NLP (INSIGHTS 2022), 2022: 62 - 67
  • [2] RoMo: Robust Unsupervised Multimodal Learning With Noisy Pseudo Labels
    Li, Yongxiang
    Qin, Yang
    Sun, Yuan
    Peng, Dezhong
    Peng, Xi
    Hu, Peng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 5086 - 5097
  • [3] Towards harnessing feature embedding for robust learning with noisy labels
    Zhang, Chuang
    Shen, Li
    Yang, Jian
    Gong, Chen
    MACHINE LEARNING, 2022, 111 (09) : 3181 - 3201
  • [4] Learning with Confident Examples: Rank Pruning for Robust Classification with Noisy Labels
    Northcutt, Curtis G.
    Wu, Tailin
    Chuang, Isaac L.
    CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI2017), 2017,
  • [5] Distributionally Robust Federated Learning for Network Traffic Classification With Noisy Labels
    Shi, Siping
    Guo, Yingya
    Wang, Dan
    Zhu, Yifei
    Han, Zhu
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (05) : 6212 - 6226
  • [6] Robust Federated Learning With Noisy Labels
    Yang, Seunghan
    Park, Hyoungseob
    Byun, Junyoung
    Kim, Changick
    IEEE INTELLIGENT SYSTEMS, 2022, 37 (02) : 35 - 43
  • [7] Robust Collaborative Learning with Noisy Labels
    Sun, Mengying
    Xing, Jing
    Chen, Bin
    Zhou, Jiayu
    20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2020), 2020: 1274 - 1279
  • [8] Learning to rectify for robust learning with noisy labels
    Sun, Haoliang
    Guo, Chenhui
    Wei, Qi
    Han, Zhongyi
    Yin, Yilong
    PATTERN RECOGNITION, 2022, 124
  • [9] Deep Learning Classification with Noisy Labels
    Sanchez, Guillaume
    Guis, Vincente
    Marxer, Ricard
    Bouchara, Frederic
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2020,