A review of semi-supervised learning for text classification

Cited by: 30
Authors
Duarte, Jose Marcio [1]
Berton, Lilian [1]
Affiliations
[1] Univ Fed Sao Paulo, Sci & Technol Dept, Cesare Mansueto Giulio Lattes Ave 1201, BR-12247014 Sao Jose Dos Campos, SP, Brazil
Keywords
Natural language processing; Text classification; Machine learning; Semi-supervised learning; SENTIMENT ANALYSIS; INFORMATION; SELECTION;
DOI
10.1007/s10462-023-10393-8
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A huge amount of data is generated daily, leading to big data challenges. One of them is related to text mining, especially text classification. To perform this task, we usually need a large set of labeled data, which can be expensive, time-consuming, or difficult to obtain. In this scenario, semi-supervised learning (SSL), the branch of machine learning concerned with using both labeled and unlabeled data, has expanded in volume and scope. Since no recent survey gives an overview of how SSL has been used in text classification, we aim to fill this gap and present an up-to-date review of SSL for text classification. We retrieved 1794 works from the last 5 years from IEEE Xplore, ACM Digital Library, Science Direct, and Springer, and selected 157 articles for inclusion in this review. We present the application domains, datasets, and languages employed in the works, as well as the text representations and machine learning algorithms. We also summarize and organize the works following a recent taxonomy of SSL. We analyze the percentage of labeled data used, the evaluation metrics, and the results obtained. Lastly, we present some limitations and future trends in the area. We aim to provide researchers and practitioners with an outline of the area as well as useful information for their current research.
Pages: 9401-9469
Number of pages: 69