Hierarchical text classification with multi-label contrastive learning and KNN

Cited by: 11
Authors
Zhang, Jun [1 ]
Li, Yubin [1 ]
Shen, Fanfan [2 ]
He, Yueshun [1 ]
Tan, Hai [2 ]
He, Yanxiang [3 ]
Affiliations
[1] East China Univ Technol, Sch Informat Engn, Nanchang 330013, Peoples R China
[2] Nanjing Audit Univ, Sch Informat Engn, Nanjing 211815, Peoples R China
[3] Wuhan Univ, Comp Sch, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Hierarchical text classification; Label hierarchy; Multi-label contrastive learning; KNN
DOI
10.1016/j.neucom.2024.127323
CLC number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Given its complicated label hierarchy, hierarchical text classification (HTC) has emerged as a challenging subtask in the realm of multi-label text classification. Existing methods enhance the quality of text representations through contrastive learning, but this supervised contrastive learning is designed for the single-label setting and has two main limitations. On one hand, sample pairs with completely identical labels, which should be treated as positive pairs, are ignored. On the other hand, every sample pair is deemed absolutely positive or absolutely negative, which overlooks the case where a pair shares some labels while each sample also carries labels of its own. Therefore, we propose a method combining multi-label contrastive learning with KNN (MLCL-KNN) for HTC. The proposed multi-label contrastive learning method pulls the text representations of sample pairs with more shared labels closer together and separates those with no labels in common. During inference, we employ KNN to retrieve several neighbor samples and regard their labels as an additional prediction, which is interpolated into the model output to further improve the performance of MLCL-KNN. Compared with the strongest baseline, MLCL-KNN achieves average improvements of 0.31%, 0.76%, 0.83%, and 0.43% on Micro-F1, Macro-F1, accuracy, and HiF respectively, which demonstrates its effectiveness.
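The record gives only a prose description of the contrastive objective. As a minimal PyTorch sketch, and not the authors' implementation, the behaviour described above (pairs with more shared labels pulled closer, pairs with no common label pushed apart) could be realized by weighting each pair's contrastive term by its label overlap; the Jaccard weighting, the temperature value, and the function name are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def multilabel_contrastive_loss(embeddings, labels, temperature=0.1):
    """Contrastive loss whose pair weights grow with label overlap.

    embeddings: (B, D) text representations
    labels:     (B, C) multi-hot label matrix
    Pairs sharing more labels are pulled closer; pairs with no
    common label contribute only as negatives in the softmax.
    """
    z = F.normalize(embeddings, dim=-1)
    sim = z @ z.T / temperature                       # (B, B) scaled similarities
    B = z.size(0)
    eye = torch.eye(B, dtype=torch.bool, device=z.device)

    # Label-overlap weights (Jaccard similarity between label sets,
    # an assumed choice of overlap measure).
    lab = labels.float()
    inter = lab @ lab.T                               # |L_i ∩ L_j|
    union = lab.sum(1, keepdim=True) + lab.sum(1) - inter
    weight = (inter / union.clamp(min=1)).masked_fill(eye, 0.0)

    # Log-softmax over all other samples; self-pairs are excluded.
    logits = sim.masked_fill(eye, float('-inf'))
    log_prob = F.log_softmax(logits, dim=1).masked_fill(eye, 0.0)
    denom = weight.sum(1).clamp(min=1e-8)
    loss = -(weight * log_prob).sum(1) / denom
    return loss.mean()
```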
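The KNN interpolation step at inference can be sketched under similar assumptions: cosine retrieval over a cache of training representations, a similarity-weighted vote over the k neighbors' label vectors, and a mixing weight lam. The parameter names and values here are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def knn_interpolated_prediction(query_emb, model_probs, store_embs,
                                store_labels, k=8, lam=0.5, temperature=0.1):
    """Interpolate classifier probabilities with neighbor labels.

    query_emb:    (D,)   representation of the test sample
    model_probs:  (C,)   per-label probabilities from the classifier
    store_embs:   (N, D) cached training representations
    store_labels: (N, C) multi-hot labels of the cached samples
    """
    q = F.normalize(query_emb, dim=-1)
    s = F.normalize(store_embs, dim=-1)
    sims = s @ q                                      # (N,) cosine similarities
    top_sim, top_idx = sims.topk(k)
    # Similarity-weighted vote over the neighbors' label sets.
    w = F.softmax(top_sim / temperature, dim=0)       # (k,)
    knn_probs = (w.unsqueeze(1) * store_labels[top_idx].float()).sum(0)
    # Interpolate the KNN "prediction" into the model output.
    return lam * knn_probs + (1.0 - lam) * model_probs
```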
Pages: 13
  • [10] Deng ZF, 2021, 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), P3259