Dual Pseudo Supervision for Semi-Supervised Text Classification with a Reliable Teacher

Cited by: 4
Authors
Li, Shujie [1, 5]
Yang, Min [2]
Li, Chengming [3]
Xu, Ruifeng [4]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
[3] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Shenzhen, Peoples R China
[4] Harbin Inst Technol, Peng Cheng Lab, Shenzhen, Peoples R China
[5] Chinese Acad Sci, SIAT, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semi-supervised text classification; Pseudo labeling; Meta learning; Consistency regularization
DOI
10.1145/3477495.3531887
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we study semi-supervised text classification (SSTC), which exploits both labeled and additional unlabeled data. One of the most popular SSTC techniques is pseudo-labeling, which assigns pseudo labels to unlabeled data via a teacher classifier trained on labeled data. The pseudo-labeled data are then used to train a student classifier. However, when the pseudo labels are inaccurate, the student classifier learns from inaccurate data and can perform even worse than the teacher. To mitigate this issue, we propose a simple yet efficient pseudo-labeling framework called Dual Pseudo Supervision (DPS), which exploits the feedback signal from the student to guide the teacher to generate better pseudo labels. In particular, we alternately update the student on the pseudo-labeled data annotated by the teacher and optimize the teacher based on the student's performance via meta learning. In addition, we design a consistency regularization term to further improve the stability of the teacher. With these two strategies, the learned reliable teacher can provide more accurate pseudo labels to the student and thus improve overall text classification performance. We conduct extensive experiments on three benchmark datasets (AG News, Yelp, and Yahoo) to verify the effectiveness of DPS. Experimental results show that our approach achieves substantially better performance than strong competitors. For reproducibility, we will release the code and data for this paper publicly at https://github.com/GRIT621/DPS.
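The alternating teacher/student update described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes linear softmax classifiers on fixed feature vectors, and it replaces the meta-learning step with a simplified first-order feedback rule (the change in the student's labeled loss scales the teacher's gradient on its own pseudo labels). The consistency regularization term is omitted. The names `dps_step`, `ce_loss_and_grad`, and all parameters are hypothetical.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with max-subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ce_loss_and_grad(W, X, y):
    # Mean cross-entropy of a linear softmax classifier and its gradient w.r.t. W.
    n = X.shape[0]
    p = softmax(X @ W)
    loss = -np.log(p[np.arange(n), y] + 1e-12).mean()
    p[np.arange(n), y] -= 1.0          # p - one_hot(y)
    return loss, X.T @ p / n

def dps_step(W_teacher, W_student, X_lab, y_lab, X_unlab, lr=0.1):
    # 1. The teacher assigns hard pseudo labels to the unlabeled batch.
    pseudo = softmax(X_unlab @ W_teacher).argmax(axis=1)

    # 2. The student takes one gradient step on the pseudo-labeled batch.
    loss_before, _ = ce_loss_and_grad(W_student, X_lab, y_lab)
    _, g_student = ce_loss_and_grad(W_student, X_unlab, pseudo)
    W_student_new = W_student - lr * g_student
    loss_after, _ = ce_loss_and_grad(W_student_new, X_lab, y_lab)

    # 3. Feedback signal (assumed form): how much the student's labeled
    #    loss dropped after learning from the teacher's pseudo labels.
    feedback = loss_before - loss_after

    # 4. The teacher combines its supervised gradient with the
    #    feedback-weighted gradient on its own pseudo labels. Positive
    #    feedback reinforces those labels; negative feedback pushes away.
    _, g_sup = ce_loss_and_grad(W_teacher, X_lab, y_lab)
    _, g_pl = ce_loss_and_grad(W_teacher, X_unlab, pseudo)
    W_teacher_new = W_teacher - lr * (g_sup + feedback * g_pl)
    return W_teacher_new, W_student_new, feedback
```

Calling `dps_step` repeatedly on fresh labeled and unlabeled batches gives the alternation: the student learns only from pseudo labels, while the teacher is adjusted by both the labeled data and the student's measured progress.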
Pages: 2513-2518
Number of pages: 6