Efficient fine-tuning of short text classification based on large language model

Cited: 0
Authors
Wang, Likun [1 ]
Affiliations
[1] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650500, Yunnan, Peoples R China
Source
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON MODELING, NATURAL LANGUAGE PROCESSING AND MACHINE LEARNING, CMNM 2024 | 2024
Keywords
short text classification; large language model; LLaMA; LoRA fine-tuning;
DOI
10.1145/3677779.3677785
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
With the rapid growth of social networks, massive numbers of short texts emerge every day, carrying valuable information such as user interests and intentions, which makes the mining and classification of short-text information particularly important. However, the inherent sparsity of features and high noise of short texts limit the performance of traditional machine learning methods on short text classification. Meanwhile, many neural network models rely on large amounts of annotated data during training, yet obtaining sufficient annotated data is challenging in practice. Inspired by recent large language models, this article proposes an efficient fine-tuning method for short text classification based on the LLaMA large language model. It leverages the strong learning ability of large language models to expand text information, and combines LoRA fine-tuning of the frozen base model with instruction learning so that downstream classification tasks can be handled more effectively. Experimental results on real datasets show that the proposed method improves the accuracy of short text classification.
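The LoRA fine-tuning idea the abstract describes (freezing the pretrained weights and training only a low-rank update) can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the class name `LoRALinear`, the rank `r`, and the scaling factor `alpha` are illustrative choices; the effective weight is W + (alpha / r) · B·A, with W frozen and only A and B trainable.

```python
# Minimal sketch of a LoRA-adapted linear layer (illustrative, not the paper's code).
import numpy as np

class LoRALinear:
    def __init__(self, weight, r=8, alpha=16, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W = weight                                      # frozen pretrained weight, shape (out, in)
        self.A = rng.normal(0, 0.01, (r, weight.shape[1]))   # trainable low-rank factor, shape (r, in)
        self.B = np.zeros((weight.shape[0], r))              # trainable factor, zero-initialized
        self.scale = alpha / r

    def forward(self, x):
        # Base path uses the frozen weight; the LoRA path adds the low-rank update.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

W = np.eye(4)          # stand-in for a pretrained weight matrix
layer = LoRALinear(W)
x = np.ones((1, 4))
# Because B starts at zero, the adapter contributes nothing initially,
# so the output equals the frozen base model's output.
print(np.allclose(layer.forward(x), x @ W.T))  # True
```

During fine-tuning, gradients would flow only into A and B, so the number of trainable parameters is a small fraction of the full model, which is what makes this approach efficient for adapting a large model like LLaMA to short text classification.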
Pages: 33-38 (6 pages)