DisCo: Distilled Student Models Co-training for Semi-supervised Text Mining

Citations: 0
Authors
Jiang, Weifeng [1 ,2 ]
Mao, Qianren [2 ]
Lin, Chenghua [3 ]
Li, Jianxin [2 ,4 ]
Deng, Ting [4 ]
Yang, Weiyi [4 ]
Wang, Zheng [5 ]
Affiliations
[1] Nanyang Technol Univ, SCSE, Singapore, Singapore
[2] Zhongguancun Lab, Beijing, Peoples R China
[3] Univ Manchester, Dept Comp Sci, Manchester, Lancs, England
[4] Beihang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China
[5] Univ Leeds, Sch Comp, Leeds, W Yorkshire, England
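Source
2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023 | 2023
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes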
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many text mining models are constructed by fine-tuning a large deep pre-trained language model (PLM) in downstream tasks. However, a significant challenge nowadays is maintaining performance when we use a lightweight model with limited labelled samples. We present DisCo, a semi-supervised learning (SSL) framework for fine-tuning a cohort of small student models generated from a large PLM using knowledge distillation. Our key insight is to share complementary knowledge among distilled student cohorts to promote their SSL effectiveness. DisCo employs a novel co-training technique to optimize a cohort of multiple small student models by promoting knowledge sharing among students under diversified views: model views produced by different distillation strategies and data views produced by various input augmentations. We evaluate DisCo on both semi-supervised text classification and extractive summarization tasks. Experimental results show that DisCo can produce student models that are 7.6x smaller and 4.8x faster in inference than the baseline PLMs while maintaining comparable performance. We also show that DisCo-generated student models outperform the similar-sized models elaborately tuned in distinct tasks.
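The co-training idea described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the loss form (supervised cross-entropy plus a mean-squared consistency term between two students' predictions on the same unlabelled sample seen under different data views), the function names, and the `weight` parameter are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Supervised loss for one labelled sample."""
    return -math.log(probs[label])

def mse(p, q):
    """Mean squared difference between two students' predicted distributions."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)

def co_training_loss(labeled_logits, label,
                     unlabeled_logits, peer_unlabeled_logits,
                     weight=1.0):
    """Hypothetical per-student objective (an assumption, not the paper's formula).

    labeled_logits        : this student's scores on a labelled sample
    unlabeled_logits      : this student's scores on an unlabelled sample (view A)
    peer_unlabeled_logits : a peer student's scores on the same sample (view B)
    """
    supervised = cross_entropy(softmax(labeled_logits), label)
    consistency = mse(softmax(unlabeled_logits), softmax(peer_unlabeled_logits))
    return supervised + weight * consistency
```

In this sketch, when the two students agree on the unlabelled sample the consistency term vanishes and only the supervised loss remains; disagreement raises the loss, which is what would push the students to share knowledge.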
Pages: 4015 - 4030
Page count: 16
Related papers
50 in total
  • [21] Inductive Semi-supervised Multi-Label Learning with Co-Training
    Zhan, Wang
    Zhang, Min-Ling
    KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2017, : 1305 - 1314
  • [22] Semi-Supervised Learning of Alternatively Spliced Exons Using Co-Training
    Tangirala, Karthik
    Caragea, Doina
    2011 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM 2011), 2011, : 243 - 246
  • [23] Co-training semi-supervised active learning algorithm with noise filter
    School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang 212013, China
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2009, 5 (750-755):
  • [24] A Co-training Based Semi-supervised Human Action Recognition Algorithm
    Yuan, Hejin
    Wang, Cuiru
    Liu, Jun
    MANUFACTURING SYSTEMS AND INDUSTRY APPLICATIONS, 2011, 267 : 1065 - 1070
  • [25] Stacked co-training for semi-supervised multi-label learning
    Li, Jiaxuan
    Zhu, Xiaoyan
    Wang, Hongrui
    Zhang, Yu
    Wang, Jiayin
    INFORMATION SCIENCES, 2024, 677
  • [26] Safe Multi-view Co-training for Semi-supervised Regression
    Liu, Li Yan
    Huang, Peng
    Min, Fan
    2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2022, : 56 - 65
  • [27] Co-training generative adversarial networks for semi-supervised classification method
    Xu, Zhe
    Geng, Jie
    Jiang, Wen
    Zhang, Zhuo
    Zeng, Qing-Jie
    Guangxue Jingmi Gongcheng/Optics and Precision Engineering, 2021, 29 (05): 1127 - 1135
  • [28] A semi-supervised extreme learning machine method based on co-training
    Li, Kunlun
    Zhang, Juan
    Xu, Hongyu
    Luo, Shangzong
    Li, Hexin
    Journal of Computational Information Systems, 2013, 9 (01): 207 - 214
  • [29] Co-Training with Validation: A Generic Framework for Semi-Supervised Relation Extraction
    Zhang, Shun
    Lu, Xiangkui
    Wu, Jun
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 4697 - 4701
  • [30] Co-training with Clustering for the Semi-supervised Classification of Remote Sensing Images
    Aydav, Prem Shankar Singh
    Minz, Sonjharia
    PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION TECHNOLOGIES, IC3T 2015, VOL 2, 2016, 380 : 659 - 667