GACaps-HTC: graph attention capsule network for hierarchical text classification

Cited by: 9
Authors
Bang, Jinhyun [1,2]
Park, Jonghun [1,2]
Park, Jonghyuk [3]
Affiliations
[1] Seoul Natl Univ, Dept Ind Engn, 1 Gwanak ro, Seoul 08826, South Korea
[2] Seoul Natl Univ, Inst Ind Syst Innovat, 1 Gwanak ro, Seoul 08826, South Korea
[3] Kookmin Univ, Dept AI Big Data & Management, 77 Jungnung ro, Seoul 02707, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Hierarchical text classification; Graph neural network; Capsule network; Attention mechanism; Natural language processing;
DOI
10.1007/s10489-023-04585-6
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Hierarchical text classification has been receiving increasing attention due to its vast range of applications in real-world natural language processing tasks. While previous approaches have focused on either effectively exploiting the label hierarchy for classification or capturing latent label relationships, few studies have integrated the two. In this work, we propose a graph attention capsule network for hierarchical text classification (GACaps-HTC), designed to capture both the explicit hierarchy and the implicit relationships of labels. A graph attention network incorporates information on the label hierarchy into the textual representation, while a capsule network infers classification probabilities by learning latent label relationships through iterative routing updates. The model is optimized with a loss term designed to address the label imbalance inherent in the task. Experiments were conducted on two widely used text classification datasets, WOS-46985 and RCV1. Compared to previous state-of-the-art approaches, the proposed approach gained 0.6% in micro-F1 and 2.0% in macro-F1 on WOS-46985, and 0.3% in micro-F1 and 2.2% in macro-F1 on RCV1. Ablation studies further show that each component of GACaps-HTC contributed to the classification performance.
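As a rough illustration of the two mechanisms the abstract describes, the sketch below shows, in PyTorch, a single-head graph attention layer over label nodes and a routing-by-agreement step between capsule layers. This is a minimal sketch under assumed conventions (Velickovic et al.-style attention, Sabour et al.-style routing); all class names, shapes, and hyperparameters are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelGAT(nn.Module):
    """Hypothetical single-head graph attention layer over label nodes.

    Propagates information along the label hierarchy so each label
    embedding absorbs context from its parents and children.
    """
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (L, d) label embeddings; adj: (L, L) hierarchy adjacency,
        # assumed to include self-loops so every row has a neighbor.
        z = self.proj(h)
        L = z.size(0)
        pairs = torch.cat([z.unsqueeze(1).expand(L, L, -1),
                           z.unsqueeze(0).expand(L, L, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs).squeeze(-1))   # (L, L) scores
        e = e.masked_fill(adj == 0, float("-inf"))       # restrict to edges
        alpha = torch.softmax(e, dim=-1)                 # neighbor weights
        return F.elu(alpha @ z)                          # updated embeddings

def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement: the 'iterative updates' the abstract mentions.

    u_hat: (B, in_caps, out_caps, d) prediction vectors from lower capsules.
    Coupling logits are refined so lower capsules route to the higher
    capsules whose aggregate output they agree with.
    """
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)    # (B, in, out)
    for _ in range(num_iters):
        c = torch.softmax(b, dim=2)                          # couplings
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)             # (B, out, d)
        norm = s.norm(dim=-1, keepdim=True)
        v = (norm ** 2 / (1 + norm ** 2)) * s / (norm + 1e-8)  # squash
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)         # agreement
    return v   # class-capsule vectors; their lengths act as probabilities

Under this reading, each label's classification probability would be taken from the length of its output capsule, and the imbalance-aware loss the abstract refers to would reweight the per-label terms; both details are assumptions here, as the record does not give the paper's exact formulation.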
Pages: 20577-20594 (18 pages)