Prompt tuning discriminative language models for hierarchical text classification

Cited by: 0
Authors
du Toit, Jaco [1 ,2 ]
Dunaiski, Marcel [1 ,2 ]
Affiliations
[1] Stellenbosch Univ, Dept Math Sci, Comp Sci Div, Stellenbosch, South Africa
[2] Stellenbosch Univ, Sch Data Sci & Computat Thinking, Stellenbosch, South Africa
Source
NATURAL LANGUAGE PROCESSING | 2024
Keywords
Large language models; discriminative language models; hierarchical text classification; prompt tuning
DOI
10.1017/nlp.2024.51
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Hierarchical text classification (HTC) is a natural language processing task which aims to categorise a text document into a set of classes from a hierarchical class structure. Recent approaches to HTC focus on leveraging pre-trained language models (PLMs) and the hierarchical class structure by allowing these components to interact in various ways. Specifically, the Hierarchy-aware Prompt Tuning (HPT) method has proven effective in applying the prompt tuning paradigm to Bidirectional Encoder Representations from Transformers (BERT) models for HTC tasks. Prompt tuning aims to reduce the gap between the pre-training and fine-tuning phases by transforming the downstream task into the pre-training task of the PLM. Discriminative PLMs, which use a replaced token detection (RTD) pre-training task, have also been shown to perform better on flat text classification tasks when prompt tuning is used instead of vanilla fine-tuning. In this paper, we propose the Hierarchy-aware Prompt Tuning for Discriminative PLMs (HPTD) approach, which injects the HTC task into the RTD task used to pre-train discriminative PLMs. Furthermore, we make several improvements to the prompt tuning approach for discriminative PLMs that allow the method to scale to HTC tasks with much larger hierarchical class structures. Through comprehensive experiments, we show that our method is robust and outperforms current state-of-the-art approaches on two out of three HTC benchmark datasets.
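To make the abstract's core mechanism concrete, the following minimal sketch (Python, Hugging Face Transformers) shows the underlying idea of scoring candidate class names with a discriminative PLM's replaced-token-detection head and decoding a class hierarchy top-down. The ELECTRA checkpoint, the template wording ("The topic of this text is ..."), and the toy hierarchy are illustrative assumptions, not taken from the paper; HPTD proper injects the HTC task into the RTD objective during training and tunes hierarchy-aware prompts, which this zero-shot sketch omits.

import torch
from transformers import AutoTokenizer, ElectraForPreTraining

# Illustrative checkpoint choice; any ELECTRA-style discriminator would do.
MODEL = "google/electra-base-discriminator"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = ElectraForPreTraining.from_pretrained(MODEL).eval()

def rtd_label_score(document: str, label: str) -> float:
    # Splice the candidate class name into a cloze-style template and ask the
    # RTD head how "original" its tokens look. Positive RTD logits mean the
    # discriminator thinks a token was replaced, so we negate them. Assumes
    # the document is short enough to fit the model's 512-token limit.
    prefix = tokenizer(document + " The topic of this text is",
                       add_special_tokens=False).input_ids
    label_ids = tokenizer(label, add_special_tokens=False).input_ids
    suffix = tokenizer(".", add_special_tokens=False).input_ids
    ids = ([tokenizer.cls_token_id] + prefix + label_ids + suffix
           + [tokenizer.sep_token_id])
    start = 1 + len(prefix)  # first label-token position (offset by [CLS])
    with torch.no_grad():
        logits = model(input_ids=torch.tensor([ids])).logits.squeeze(0)
    return -logits[start:start + len(label_ids)].mean().item()

# Toy two-level hierarchy (parent -> children); purely illustrative.
HIERARCHY = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["football", "tennis"],
}

def classify(document: str) -> list:
    # Greedy top-down decoding: at each level, keep the child whose name the
    # discriminator finds most plausible inside the template.
    path, node = [], "root"
    while node in HIERARCHY:
        node = max(HIERARCHY[node], key=lambda c: rtd_label_score(document, c))
        path.append(node)
    return path

print(classify("The electron's charge was first measured with oil droplets."))

In the paper's setting the prompts are tuned and the model is trained on the HTC task rather than used zero-shot as here; the greedy path decoding above is just one simple way to turn per-label RTD scores into a hierarchical prediction.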
Pages: 18
Related Papers
50 records in total
  • [41] Text Classification with Imperfect Hierarchical Structure Knowledge
    Ngo-Ye, Thomas
    Dutt, Abhijit
    AMCIS 2010 PROCEEDINGS, 2010
  • [42] Disentangled feature graph for Hierarchical Text Classification
    Liu, Renyuan
    Zhang, Xuejie
    Wang, Jin
    Zhou, Xiaobing
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (03)
  • [43] HTCSI: A Hierarchical Text Classification Method Based on Selection-Inference
    Xu, Yiming
    Feng, Jianzhou
    Gu, Chenghan
    Qin, Haonan
    Xue, Kehan
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT II, NLPCC 2024, 2025, 15360 : 307 - 318
  • [44] Beyond Simple Text Style Transfer: Unveiling Compound Text Style Transfer with Prompt-Based Pre-trained Language Models
    Ju, Shuai
    Wang, Chenxu
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 6850 - 6854
  • [45] JumpLiteGCN: A Lightweight Approach to Hierarchical Text Classification
    Liu, Teng
    Liu, Xiangzhi
    Dong, Yunfeng
    Wu, Xiaoming
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT IV, NLPCC 2024, 2025, 15362 : 54 - 66
  • [46] Hierarchical Prompt Tuning for Few-Shot Multi-Task Learning
    Liu, Jingping
    Chen, Tao
    Liang, Zujie
    Jiang, Haiyun
    Xiao, Yanghua
    Wei, Feng
    Qian, Yuxi
    Hao, Zhenghong
    Han, Bing
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 1556 - 1565
  • [47] SimEmotion: A Simple Knowledgeable Prompt Tuning Method for Image Emotion Classification
    Deng, Sinuo
    Shi, Ge
    Wu, Lifang
    Xing, Lehao
    Hu, Wenjin
    Zhang, Heng
    Xiang, Ye
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT III, 2022, : 222 - 229
  • [48] SDPT: Synchronous Dual Prompt Tuning for Fusion-Based Visual-Language Pre-trained Models
    Zhou, Yang
    Wu, Yongjian
    Saiyin, Jiya
    Wei, Bingzheng
    Lai, Maode
    Chang, Eric
    Xu, Yan
    COMPUTER VISION - ECCV 2024, PT XLIX, 2025, 15107 : 340 - 356
  • [49] Leveraging large language models for medical text classification: a hospital readmission prediction case
    Nazyrova, Nodira
    Chahed, Salma
    Chausalet, Thierry
    Dwek, Miriam
    2024 14TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION SYSTEMS, ICPRS, 2024
  • [50] To prompt or not to prompt: Navigating the use of Large Language Models for integrating and modeling heterogeneous data
    Remadi, Adel
    El Hage, Karim
    Hobeika, Yasmina
    Bugiotti, Francesca
    DATA & KNOWLEDGE ENGINEERING, 2024, 152