A survey of text classification based on pre-trained language model

Cited by: 1
Authors
Wu, Yujia [1 ,2 ]
Wan, Jun [3 ]
Affiliations
[1] Sanda Univ, Sch Informat Sci & Technol, Shanghai 201209, Peoples R China
[2] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[3] Zhongnan Univ Econ & Law, Sch Informat & Safety Engn, Wuhan 430073, Peoples R China
Keywords
Machine learning; Deep learning; Neural networks; Natural language processing; Text classification; Transformer; Pre-trained language models; Capsule networks
DOI
10.1016/j.neucom.2024.128921
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Text classification is widely used within Natural Language Processing (NLP). In recent years, pre-trained language models (PLMs) based on the Transformer architecture have made significant strides across a range of artificial intelligence tasks, and text classification with PLMs has emerged as a prominent research focus within NLP. While several review papers examine text classification and Transformer models, there is a notable lack of comprehensive surveys specifically addressing text classification grounded in PLMs. To address this gap, the present survey provides an extensive overview of text classification techniques that leverage PLMs. The primary components of this review are: (1) an introduction, (2) a systematic examination of PLMs, (3) deep learning-based text classification methodologies, (4) text classification approaches utilizing pre-trained models, (5) commonly used datasets and evaluation metrics in text classification, (6) prevalent challenges and emerging trends in the field, and (7) a conclusion.
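As background on the technique the survey covers, the following is a minimal sketch of the standard PLM fine-tuning recipe for text classification, using the Hugging Face transformers library; the checkpoint name, toy data, label set, and hyperparameters are illustrative assumptions rather than details taken from the paper.

import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: an encoder-style PLM checkpoint; any comparable model works here.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy sentiment data, for illustration only.
texts = ["the movie was wonderful", "a dull, lifeless film"]
labels = torch.tensor([1, 0])

# Tokenize into padded input_ids / attention_mask tensors.
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few gradient steps on the toy batch
    outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    preds = model(**batch).logits.argmax(dim=-1)
print(preds.tolist())  # predicted class ids for the toy inputs

This pattern, a pre-trained Transformer encoder plus a task-specific classification head fine-tuned end to end, is the core approach examined in item (4) of the outline above.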
Pages: 16
Related papers
50 records in total
  • [1] Pre-Trained Language Models for Text Generation: A Survey
    Li, Junyi
    Tang, Tianyi
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    ACM COMPUTING SURVEYS, 2024, 56 (09)
  • [2] Better Few-Shot Text Classification with Pre-trained Language Model
    Chen, Zheng
    Zhang, Yunchen
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT II, 2021, 12892 : 537 - 548
  • [3] ZeroAE: Pre-trained Language Model based Autoencoder for Transductive Zero-shot Text Classification
    Guo, Kaihao
    Yu, Hang
    Liao, Cong
    Li, Jianguo
    Zhang, Haipeng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3202 - 3219
  • [4] A Pre-Trained Language Model Based on LED for Tibetan Long Text Summarization
    Ouyang, Xinpeng
    Yan, Xiaodong
    Hao, Minghui
PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 992 - 997
  • [5] A text restoration model for ancient texts based on pre-trained language model RoBERTa
    Gu, Zhongyu
    Guan, Yanzhi
    Zhang, Shuai
    PROCEEDINGS OF 2024 4TH INTERNATIONAL CONFERENCE ON INTERNET OF THINGS AND MACHINE LEARNING, IOTML 2024, 2024, : 96 - 102
  • [6] Electric Power Audit Text Classification With Multi-Grained Pre-Trained Language Model
    Meng, Qinglin
    Song, Yan
    Mu, Jian
    Lv, Yuanxu
    Yang, Jiachen
    Xu, Liang
    Zhao, Jin
    Ma, Junwei
    Yao, Wei
    Wang, Rui
    Xiao, Maoxiang
    Meng, Qingyu
    IEEE ACCESS, 2023, 11 : 13510 - 13518
  • [7] Text data augmentation and pre-trained Language Model for enhancing text classification of low-resource languages
    Ziyaden, Atabay
    Yelenov, Amir
    Hajiyev, Fuad
    Rustamov, Samir
    Pak, Alexandr
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [8] CLIP-Llama: A New Approach for Scene Text Recognition with a Pre-Trained Vision-Language Model and a Pre-Trained Language Model
    Zhao, Xiaoqing
    Xu, Miaomiao
    Silamu, Wushour
    Li, Yanbing
    SENSORS, 2024, 24 (22)
  • [9] Question Answering based Clinical Text Structuring Using Pre-trained Language Model
    Qiu, Jiahui
    Zhou, Yangming
    Ma, Zhiyuan
    Ruan, Tong
    Liu, Jinlin
    Sun, Jing
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 1596 - 1600
  • [10] A Survey of Controllable Text Generation Using Transformer-based Pre-trained Language Models
    Zhang, Hanqing
    Song, Haolin
    Li, Shaoyu
    Zhou, Ming
    Song, Dawei
    ACM COMPUTING SURVEYS, 2024, 56 (03)