DisCo: Distilled Student Models Co-training for Semi-supervised Text Mining

Cited by: 0
Authors
Jiang, Weifeng [1 ,2 ]
Mao, Qianren [2 ]
Lin, Chenghua [3 ]
Li, Jianxin [2 ,4 ]
Deng, Ting [4 ]
Yang, Weiyi [4 ]
Wang, Zheng [5 ]
Affiliations
[1] Nanyang Technol Univ, SCSE, Singapore, Singapore
[2] Zhongguancun Lab, Beijing, Peoples R China
[3] Univ Manchester, Dept Comp Sci, Manchester, Lancs, England
[4] Beihang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China
[5] Univ Leeds, Sch Comp, Leeds, W Yorkshire, England
Source
2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023 | 2023
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many text mining models are built by fine-tuning a large deep pre-trained language model (PLM) on downstream tasks. However, a significant practical challenge is maintaining performance when using a lightweight model with limited labelled samples. We present DisCo, a semi-supervised learning (SSL) framework for fine-tuning a cohort of small student models generated from a large PLM using knowledge distillation. Our key insight is to share complementary knowledge among the distilled student cohort to promote their SSL effectiveness. DisCo employs a novel co-training technique to optimize a cohort of multiple small student models by promoting knowledge sharing among students under diversified views: model views produced by different distillation strategies and data views produced by various input augmentations. We evaluate DisCo on both semi-supervised text classification and extractive summarization tasks. Experimental results show that DisCo can produce student models that are 7.6x smaller and 4.8x faster at inference than the baseline PLMs while maintaining comparable performance. We also show that DisCo-generated student models outperform similar-sized models that are elaborately tuned for each distinct task.
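The abstract names two ingredients: a cohort of distilled students, and knowledge sharing across diversified model views and data views. The following minimal PyTorch sketch illustrates only the training-signal structure of such a co-training setup (supervised loss on the few labelled samples, plus a cross-student consistency loss on differently augmented unlabelled inputs). The stand-in MLP students, the Gaussian-noise "augmentations", the KL-based consistency term, and the weight lam are all illustrative assumptions, not the paper's actual design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def make_student(dim_in=128, n_classes=4):
        # Stand-in for one distilled student; in the paper, students are small
        # PLMs produced by different distillation strategies ("model views").
        return nn.Sequential(nn.Linear(dim_in, 64), nn.ReLU(), nn.Linear(64, n_classes))

    student_a, student_b = make_student(), make_student()
    opt = torch.optim.Adam(
        list(student_a.parameters()) + list(student_b.parameters()), lr=1e-3)

    # Toy batch: a few labelled examples plus a larger unlabelled pool.
    x_lab, y_lab = torch.randn(8, 128), torch.randint(0, 4, (8,))
    x_unl = torch.randn(32, 128)

    lam = 1.0  # assumed weight on the consistency term
    for step in range(100):
        # Two "data views" of the same unlabelled inputs; real text
        # augmentations (e.g. dropout or token perturbations) would go here.
        view_a = x_unl + 0.1 * torch.randn_like(x_unl)
        view_b = x_unl + 0.1 * torch.randn_like(x_unl)

        # Supervised loss: both students fit the few labelled samples.
        sup = (F.cross_entropy(student_a(x_lab), y_lab)
               + F.cross_entropy(student_b(x_lab), y_lab))

        # Co-training consistency: each student matches the other's
        # (detached) prediction on a different view of the same inputs.
        log_p_a = F.log_softmax(student_a(view_a), dim=-1)
        log_p_b = F.log_softmax(student_b(view_b), dim=-1)
        cons = (F.kl_div(log_p_a, log_p_b.detach(), log_target=True, reduction="batchmean")
                + F.kl_div(log_p_b, log_p_a.detach(), log_target=True, reduction="batchmean"))

        loss = sup + lam * cons
        opt.zero_grad()
        loss.backward()
        opt.step()

In the paper, the students additionally differ by construction (distinct distillation strategies), so their predictions disagree in complementary ways; the sketch above captures only how the supervised and cross-student consistency signals combine during training.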
Pages: 4015 - 4030
Page count: 16
Related Papers
50 items in total
  • [41] Root-Cause Analysis with Semi-Supervised Co-Training for Integrated Systems
    Pan, Renjian
    Li, Xin
    Chakrabarty, Krishnendu
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2024, 29 (03)
  • [42] Co-Training Semi-Supervised Active Learning Algorithm based on Noise Filter
    Chen, Ya-bi
    Zhan, Yong-zhao
    PROCEEDINGS OF THE 2009 WRI GLOBAL CONGRESS ON INTELLIGENT SYSTEMS, VOL III, 2009, : 524 - 528
  • [43] Three-Way Co-Training with Pseudo Labels for Semi-Supervised Learning
    Wang, Liuxin
    Gao, Can
    Zhou, Jie
    Wen, Jiajun
    MATHEMATICS, 2023, 11 (15)
  • [44] Multi-Label Learning with Co-Training Based on Semi-Supervised Regression
    Xu, Meixiang
    Sun, Fuming
    Jiang, Xiaojun
    2014 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC), 2014, : 175 - 180
  • [45] Temporal-Frequency Co-training for Time Series Semi-supervised Learning
    Liu, Zhen
    Ma, Qianli
    Ma, Peitian
    Wang, Linghao
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 7, 2023, : 8923 - 8931
  • [46] Fine-Tuning Language Models For Semi-Supervised Text Mining
    Chen, Xinyu
    Beaver, Ian
    Freeman, Cynthia
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 3608 - 3617
  • [47] An Efficient Approach to Select Instances in Self-Training and Co-Training Semi-Supervised Methods
    Ovidio Vale, Karliane Medeiros
    Gorgonio, Arthur Costa
    Gorgonio, Flavius Da Luz E.
    De Paula Canuto, Anne Magaly
    IEEE ACCESS, 2022, 10 : 7254 - 7276
  • [48] Learning Adaptive Semi-Supervised Multi-Output Soft-Sensors With Co-Training of Heterogeneous Models
    Li, Dong
    Huang, Daoping
    Yu, Guangping
    Liu, Yiqi
    IEEE ACCESS, 2020, 8 : 46493 - 46504
  • [49] Co-Training Semi-Supervised Deep Learning for Sentiment Classification of MOOC Forum Posts
    Chen, Jing
    Feng, Jun
    Sun, Xia
    Liu, Yang
    SYMMETRY-BASEL, 2020, 12 (01)
  • [50] When less is more: on the value of "co-training" for semi-supervised software defect predictors
    Majumder, Suvodeep
    Chakraborty, Joymallya
    Menzies, Tim
    EMPIRICAL SOFTWARE ENGINEERING, 2024, 29