HeteroHTC: Enhancing Hierarchical Text Classification via Heterogeneity Encoding of Label Hierarchy

Cited by: 1
Authors
Song, Junru [1]
Chen, Tianlei [3]
Yang, Yang [4]
Wang, Feifei [2,3]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China
[2] Renmin Univ China, Ctr Appl Stat, Beijing 100872, Peoples R China
[3] Renmin Univ China, Sch Stat, Beijing 100872, Peoples R China
[4] Peking Univ, Sch Comp Sci, Beijing 100072, Peoples R China
Keywords
Hierarchical Text Classification; Heterogeneous Graph Transformer; Large Language Models;
DOI
10.1016/j.eswa.2025.126558
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Hierarchical Text Classification (HTC) is a challenging subtask of multi-label text classification in which labels are organized into a pre-defined hierarchy. Recent works primarily encode documents and labels separately before cross-attention-based feature extraction, and in the process they collectively overlook a crucial characteristic of label hierarchies: "heterogeneity". Specifically, labels on different levels have different granularities and should be projected onto distinct feature spaces; likewise, the relationships among labels are of various types, so message passing along each type of relationship should occur in its own feature space. We term these properties "granularity heterogeneity" and "relationship heterogeneity", respectively. To fully exploit these ubiquitous yet overlooked properties, we propose HeteroHTC, which features a heterogeneous label hierarchy encoder. Additionally, we leverage pre-trained Large Language Models (LLMs) to generate high-quality label descriptions with strategically designed prompts. HeteroHTC outperforms almost all baselines in our extensive experiments on three datasets, demonstrating its effectiveness and the necessity of taking granularity and relationship heterogeneity into consideration.
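The two heterogeneity properties named in the abstract can be illustrated with a toy sketch. This is not the authors' encoder: the three-label hierarchy, the dimension `DIM`, the random weights, the two relation names, and the mean aggregation (used here in place of the attention of a Heterogeneous Graph Transformer) are all assumptions chosen only to show level-specific projections and relation-specific message spaces.

```python
import random

DIM = 4                      # hypothetical feature size
rng = random.Random(0)       # fixed seed so the sketch is reproducible


def rand_matrix(rows, cols):
    """Random weight matrix standing in for learned parameters."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(W, x):
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


# Hypothetical toy hierarchy: label -> (level, parent)
labels = {
    "science": (0, None),
    "physics": (1, "science"),
    "biology": (1, "science"),
}
embeddings = {name: [rng.uniform(-1, 1) for _ in range(DIM)] for name in labels}

# Granularity heterogeneity: one projection matrix per hierarchy level.
level_proj = {level: rand_matrix(DIM, DIM) for level in (0, 1)}

# Relationship heterogeneity: one message matrix per relation type.
relation_proj = {
    rel: rand_matrix(DIM, DIM)
    for rel in ("parent_to_child", "child_to_parent")
}


def edges():
    """Yield (source, target, relation) for both directions of each link."""
    for child, (_, parent) in labels.items():
        if parent is not None:
            yield parent, child, "parent_to_child"
            yield child, parent, "child_to_parent"


def encode_once():
    """One round of level-aware projection and relation-aware message passing."""
    projected = {
        name: matvec(level_proj[labels[name][0]], embeddings[name])
        for name in labels
    }
    incoming = {name: [] for name in labels}
    for src, dst, rel in edges():
        incoming[dst].append(matvec(relation_proj[rel], projected[src]))

    updated = {}
    for name in labels:
        msgs = incoming[name]
        # Mean aggregation is a simplification; attention would weight messages.
        mean = [sum(col) / len(msgs) for col in zip(*msgs)] if msgs else [0.0] * DIM
        updated[name] = [p + m for p, m in zip(projected[name], mean)]
    return updated


encoded = encode_once()
```

In this sketch, labels on level 0 and level 1 never share a projection matrix, and a message sent from parent to child is transformed by a different matrix than one sent from child to parent, which is the distinction the abstract draws between the two kinds of heterogeneity.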
Pages: 11