ToSA: A Top-Down Tree Structure Awareness Model for Hierarchical Text Classification

Cited by: 0
Authors
Zhao, Deji [1]
Ning, Bo [1]
Song, Shuangyong [2]
Wang, Chao [2]
Chen, Xiangyan [2]
Yu, Xiaoguang [2]
Zou, Bo [2]
Affiliations
[1] Dalian Maritime Univ, Sch Informat Sci & Technol, Dalian, Peoples R China
[2] JD AI Res, Beijing, Peoples R China
Source
WEB AND BIG DATA, PT II, APWEB-WAIM 2022 | 2023 / Vol. 13422
Keywords
Hierarchical multi-label text classification; Graph embedding; Text generation
DOI
10.1007/978-3-031-25198-6_3
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hierarchical text classification (HTC) is a challenging task that classifies textual descriptions into a taxonomic hierarchy. Existing methods have difficulty modeling the hierarchical label structure. They focus on using graph embedding methods to encode the hierarchy, ignoring that HTC labels form a tree. Tree and graph structures differ: in a graph, message passing is undirected, which leads to imbalanced message transmission between nodes when applied to HTC. Because nodes in different layers have inheritance relationships, information transmission between nodes in the HTC task should be directional and hierarchical. In this paper, we propose a Top-Down Tree Structure Awareness Model, called ToSA, to extract hierarchical structure features. We regard HTC as a sequence generation task and introduce prior hierarchical information into the decoding process. We block the information flow in one direction so that the graph embedding method is better suited to the HTC task, and thereby obtain an enhanced tree structure representation. Experimental results show that our model achieves the best results on both the public WOS dataset and a collected e-commerce user intent classification dataset.
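The contrast the abstract draws between undirected message passing and a one-directional, top-down flow can be illustrated with a toy example. The sketch below is not the ToSA implementation: the label names, the mean-aggregation rule, and the helper functions `directed_edges`, `undirected_edges`, and `propagate` are all assumptions made for illustration. It only shows that keeping parent-to-child edges stops child information from leaking upward, while undirected edges let it flow both ways.

```python
# Hypothetical toy label tree, stored as child -> parent (None marks the root).
PARENT = {
    "root": None,
    "CS": "root",
    "Medical": "root",
    "ML": "CS",
    "Databases": "CS",
}

def directed_edges(parent):
    """Keep only parent -> child edges, i.e. block the child -> parent direction."""
    return [(p, c) for c, p in parent.items() if p is not None]

def undirected_edges(parent):
    """Both directions, as a plain undirected graph encoder would use."""
    down = directed_edges(parent)
    return down + [(c, p) for p, c in down]

def propagate(features, edges, steps=1):
    """Per step, each node averages its own vector with those of its senders.

    Mean aggregation is an assumption of this sketch, not the paper's rule.
    """
    for _ in range(steps):
        incoming = {n: [features[n]] for n in features}
        for src, dst in edges:
            incoming[dst].append(features[src])
        features = {
            n: [sum(col) / len(vecs) for col in zip(*vecs)]
            for n, vecs in incoming.items()
        }
    return features

# One-hot features make it easy to read off whose information reached whom.
nodes = list(PARENT)
one_hot = {
    n: [1.0 if i == j else 0.0 for j in range(len(nodes))]
    for i, n in enumerate(nodes)
}

top_down = propagate(one_hot, directed_edges(PARENT), steps=2)
both_ways = propagate(one_hot, undirected_edges(PARENT), steps=2)
```

With the top-down edges, the root keeps its original vector and "ML" picks up a share of the root's information after two steps, whereas with undirected edges the root's vector is mixed with its descendants' features.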
Pages: 23-37
Page count: 15