Dual Pseudo Supervision for Semi-Supervised Text Classification with a Reliable Teacher

Cited by: 4
Authors
Li, Shujie [1, 5]
Yang, Min [2]
Li, Chengming [3]
Xu, Ruifeng [4]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
[3] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Shenzhen, Peoples R China
[4] Harbin Inst Technol, Peng Cheng Lab, Shenzhen, Peoples R China
[5] Chinese Acad Sci, SIAT, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semi-supervised text classification; Pseudo labeling; Meta learning; Consistency regularization
DOI
10.1145/3477495.3531887
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we study semi-supervised text classification (SSTC), which exploits both labeled and additional unlabeled data. One of the most popular SSTC techniques is pseudo-labeling, which assigns pseudo labels to unlabeled data via a teacher classifier trained on labeled data. The pseudo-labeled data are then used to train a student classifier. However, when the pseudo labels are inaccurate, the student classifier learns from inaccurate data and can perform even worse than the teacher. To mitigate this issue, we propose a simple yet efficient pseudo-labeling framework called Dual Pseudo Supervision (DPS), which exploits the feedback signal from the student to guide the teacher to generate better pseudo labels. In particular, we alternately update the student on the pseudo-labeled data annotated by the teacher and optimize the teacher based on the student's performance via meta learning. In addition, we design a consistency regularization term to further improve the stability of the teacher. With these two strategies, the learned reliable teacher can provide more accurate pseudo labels to the student and thus improve overall text classification performance. We conduct extensive experiments on three benchmark datasets (i.e., AG News, Yelp and Yahoo) to verify the effectiveness of our DPS method. Experimental results show that our approach achieves substantially better performance than strong competitors. For reproducibility, we will release the code and data for this paper publicly at https://github.com/GRIT621/DPS.
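Illustrative sketch (not the authors' implementation): the two per-example quantities the abstract mentions can be written down in a few lines. The teacher assigns a hard pseudo label, the student is scored against it with cross-entropy, and the teacher's predictions on an input and a perturbed copy are compared by a consistency term. The KL-divergence form of the regularizer, the function names, and the logit values are all assumptions made for illustration; the abstract does not specify them, and the meta-learning update of the teacher is omitted here.

```python
import math

def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_label(teacher_logits):
    """Hard pseudo label: index of the teacher's most probable class."""
    probs = softmax(teacher_logits)
    return max(range(len(probs)), key=probs.__getitem__)

def cross_entropy(student_logits, label):
    """Student loss on a single pseudo-labeled example."""
    return -math.log(softmax(student_logits)[label])

def consistency_loss(logits_orig, logits_aug):
    """KL(p_orig || p_aug) between teacher predictions on an input and a
    perturbed copy. The KL form is an assumption, not taken from the paper."""
    p = softmax(logits_orig)
    q = softmax(logits_aug)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits for one unlabeled text and three classes.
teacher_orig = [2.0, 0.5, -1.0]   # teacher on the original text
teacher_aug = [1.8, 0.7, -0.9]    # teacher on a perturbed version
student = [0.3, 0.2, 0.1]         # student on the original text

label = pseudo_label(teacher_orig)                         # teacher annotates
student_loss = cross_entropy(student, label)               # student learns from it
teacher_reg = consistency_loss(teacher_orig, teacher_aug)  # teacher stability term
```

In a full training loop these quantities would be averaged over batches, and the student's loss on labeled data after its update would be fed back to adjust the teacher; that feedback step is the part this sketch leaves out.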
Pages: 2513-2518
Number of pages: 6