HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification

Authors
Jain, Vidit [1]
Rungta, Mukund [1, 3]
Zhuang, Yuchen [1]
Yu, Yue [1]
Wang, Zeyu [2]
Gao, Mu [1]
Skolnick, Jeffrey [1]
Zhang, Chao [1]
Affiliations
[1] Georgia Inst Technol, Atlanta, GA 30332 USA
[2] Yale Univ, New Haven, CT USA
[3] Microsoft, Cambridge, MA USA
Source
PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hierarchical text classification (HTC) is a complex subtask under multi-label text classification, characterized by a hierarchical label taxonomy and data imbalance. The best-performing models aim to learn a static representation by combining document and hierarchical label information. However, the relevance of document sections can vary based on the hierarchy level, necessitating a dynamic document representation. To address this, we propose HiGen, a text-generation-based framework utilizing language models to encode dynamic text representations. We introduce a level-guided loss function to capture the relationship between text and label name semantics. Our approach incorporates a task-specific pretraining strategy, adapting the language model to in-domain knowledge and significantly enhancing performance for classes with limited examples. Furthermore, we present a new and valuable dataset called ENZYME, designed for HTC, which comprises articles from PubMed with the goal of predicting Enzyme Commission (EC) numbers. Through extensive experiments on the ENZYME dataset and the widely recognized WOS and NYT datasets, our methodology demonstrates superior performance, surpassing existing approaches while efficiently handling data and mitigating class imbalance. We release our code and dataset here: https://github.com/viditjain99/HiGen.
Pages: 1354-1368
Page count: 15