Prompt tuning discriminative language models for hierarchical text classification

Cited by: 0
Authors
du Toit, Jaco [1 ,2 ]
Dunaiski, Marcel [1 ,2 ]
Affiliations
[1] Stellenbosch Univ, Dept Math Sci, Comp Sci Div, Stellenbosch, South Africa
[2] Stellenbosch Univ, Sch Data Sci & Computat Thinking, Stellenbosch, South Africa
Source
NATURAL LANGUAGE PROCESSING | 2024
Keywords
Large language models; discriminative language models; hierarchical text classification; prompt tuning
DOI
10.1017/nlp.2024.51
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Hierarchical text classification (HTC) is a natural language processing task which aims to categorise a text document into a set of classes from a hierarchical class structure. Recent approaches to HTC focus on leveraging pre-trained language models (PLMs) and the hierarchical class structure by allowing these components to interact in various ways. Specifically, the Hierarchy-aware Prompt Tuning (HPT) method has proven effective in applying the prompt tuning paradigm to Bidirectional Encoder Representations from Transformers (BERT) models for HTC tasks. Prompt tuning aims to reduce the gap between the pre-training and fine-tuning phases by transforming the downstream task into the pre-training task of the PLM. Discriminative PLMs, which use a replaced token detection (RTD) pre-training task, have also been shown to perform better on flat text classification tasks when prompt tuning is used instead of vanilla fine-tuning. In this paper, we propose the Hierarchy-aware Prompt Tuning for Discriminative PLMs (HPTD) approach, which injects the HTC task into the RTD task used to pre-train discriminative PLMs. Furthermore, we make several improvements to the prompt tuning approach for discriminative PLMs that allow the method to scale to HTC tasks with much larger hierarchical class structures. Through comprehensive experiments, we show that our method is robust and outperforms current state-of-the-art approaches on two out of three HTC benchmark datasets.
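To make the abstract's core mechanism concrete, the following minimal sketch (Python, Hugging Face Transformers) shows the underlying idea of scoring candidate class names with a discriminative PLM's replaced-token-detection head and decoding a class hierarchy top-down. The ELECTRA checkpoint, the template wording ("The topic of this text is ..."), and the toy hierarchy are illustrative assumptions, not taken from the paper; HPTD proper injects the HTC task into the RTD objective during training and tunes hierarchy-aware prompts, which this zero-shot sketch omits.

import torch
from transformers import AutoTokenizer, ElectraForPreTraining

# Illustrative checkpoint choice; any ELECTRA-style discriminator would do.
MODEL = "google/electra-base-discriminator"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = ElectraForPreTraining.from_pretrained(MODEL).eval()

def rtd_label_score(document: str, label: str) -> float:
    # Splice the candidate class name into a cloze-style template and ask the
    # RTD head how "original" its tokens look. Positive RTD logits mean the
    # discriminator thinks a token was replaced, so we negate them. Assumes
    # the document is short enough to fit the model's 512-token limit.
    prefix = tokenizer(document + " The topic of this text is",
                       add_special_tokens=False).input_ids
    label_ids = tokenizer(label, add_special_tokens=False).input_ids
    suffix = tokenizer(".", add_special_tokens=False).input_ids
    ids = ([tokenizer.cls_token_id] + prefix + label_ids + suffix
           + [tokenizer.sep_token_id])
    start = 1 + len(prefix)  # first label-token position (offset by [CLS])
    with torch.no_grad():
        logits = model(input_ids=torch.tensor([ids])).logits.squeeze(0)
    return -logits[start:start + len(label_ids)].mean().item()

# Toy two-level hierarchy (parent -> children); purely illustrative.
HIERARCHY = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["football", "tennis"],
}

def classify(document: str) -> list:
    # Greedy top-down decoding: at each level, keep the child whose name the
    # discriminator finds most plausible inside the template.
    path, node = [], "root"
    while node in HIERARCHY:
        node = max(HIERARCHY[node], key=lambda c: rtd_label_score(document, c))
        path.append(node)
    return path

print(classify("The electron's charge was first measured with oil droplets."))

In the paper's setting the prompts are tuned and the model is trained on the HTC task rather than used zero-shot as here; the greedy path decoding above is just one simple way to turn per-label RTD scores into a hierarchical prediction.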
Pages: 18
Related Papers
50 records in total
  • [41] Text Classification with Imperfect Hierarchical Structure Knowledge
    Ngo-Ye, Thomas
    Dutt, Abhijit
    AMCIS 2010 PROCEEDINGS, 2010
  • [42] Disentangled feature graph for Hierarchical Text Classification
    Liu, Renyuan
    Zhang, Xuejie
    Wang, Jin
    Zhou, Xiaobing
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (03)
  • [43] HTCSI: A Hierarchical Text Classification Method Based on Selection-Inference
    Xu, Yiming
    Feng, Jianzhou
    Gu, Chenghan
    Qin, Haonan
    Xue, Kehan
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT II, NLPCC 2024, 2025, 15360 : 307 - 318
  • [44] Beyond Simple Text Style Transfer: Unveiling Compound Text Style Transfer with Prompt-Based Pre-trained Language Models
    Ju, Shuai
    Wang, Chenxu
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 6850 - 6854
  • [45] JumpLiteGCN: A Lightweight Approach to Hierarchical Text Classification
    Liu, Teng
    Liu, Xiangzhi
    Dong, Yunfeng
    Wu, Xiaoming
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT IV, NLPCC 2024, 2025, 15362 : 54 - 66
  • [46] Hierarchical Prompt Tuning for Few-Shot Multi-Task Learning
    Liu, Jingping
    Chen, Tao
    Liang, Zujie
    Jiang, Haiyun
    Xiao, Yanghua
    Wei, Feng
    Qian, Yuxi
    Hao, Zhenghong
    Han, Bing
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 1556 - 1565
  • [47] SimEmotion: A Simple Knowledgeable Prompt Tuning Method for Image Emotion Classification
    Deng, Sinuo
    Shi, Ge
    Wu, Lifang
    Xing, Lehao
    Hu, Wenjin
    Zhang, Heng
    Xiang, Ye
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT III, 2022, : 222 - 229
  • [48] SDPT: Synchronous Dual Prompt Tuning for Fusion-Based Visual-Language Pre-trained Models
    Zhou, Yang
    Wu, Yongjian
    Saiyin, Jiya
    Wei, Bingzheng
    Lai, Maode
    Chang, Eric
    Xu, Yan
    COMPUTER VISION - ECCV 2024, PT XLIX, 2025, 15107 : 340 - 356
  • [49] Leveraging large language models for medical text classification: a hospital readmission prediction case
    Nazyrova, Nodira
    Chahed, Salma
    Chausalet, Thierry
    Dwek, Miriam
    2024 14TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION SYSTEMS, ICPRS, 2024
  • [50] To prompt or not to prompt: Navigating the use of Large Language Models for integrating and modeling heterogeneous data
    Remadi, Adel
    El Hage, Karim
    Hobeika, Yasmina
    Bugiotti, Francesca
    DATA & KNOWLEDGE ENGINEERING, 2024, 152