CoocNet: a novel approach to multi-label text classification with improved label co-occurrence modeling

Cited by: 0
Authors
Li, Yi [1 ]
Shen, Junge [1 ]
Mao, Zhaoyong [1 ]
Affiliations
[1] Northwestern Polytech Univ, Unmanned Syst Res Inst, Xian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-label text classification; Contrastive learning; Attention mechanism; Label correlation; BERT;
DOI
10.1007/s10489-024-05379-0
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-label text classification (MLTC) aims to assign one or more labels to each document. Previous studies mainly use a label co-occurrence matrix computed from the training set to model correlations between labels, but this approach ignores the noise in label co-occurrence statistics and carries the poorly generalizing co-occurrence relationships over into validation and testing. In addition, modeling label co-occurrence globally pays no attention to individual documents, so local label co-occurrence relationships are lost. To address these issues, we introduce CoocNet, a new multi-label text classification model that adopts a two-step label detection scheme to model label co-occurrence relations effectively. The model first captures the global co-occurrence relationships of labels using the label co-occurrence matrix and suppresses label noise through a label-denoising attention mechanism; it then uses a contrastive learning strategy to capture the local label co-occurrence relationships among specific documents. In particular, we cast co-occurrence labeling as an auxiliary training task that runs in parallel with the multi-label classification task. This auxiliary task supervises the learning of document sentence representations by leveraging the modeled label co-occurrence relationships, improving the model's generalization ability. Another novelty is that the auxiliary task is active only during training, which prevents label co-occurrence relationships from interfering with the model's predictions outside the training phase. Experimental results on three benchmark datasets (Reuters-21578, AAPD, and RCV1) demonstrate that our model outperforms existing state-of-the-art methods.
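The two ingredients the abstract describes can be illustrated with a minimal sketch. The code below is an illustrative assumption, not the paper's actual implementation: `cooccurrence_matrix` builds the global label co-occurrence statistics from training-set labels, and `label_contrastive_loss` is a simplified stand-in for the contrastive auxiliary task, treating documents that share at least one label as positive pairs. All function names and the toy data are hypothetical.

```python
import numpy as np

def cooccurrence_matrix(Y):
    """Y: (n_docs, n_labels) multi-hot label matrix from the training set.
    Returns P with P[i, j] ~ P(label j present | label i present)."""
    C = Y.T @ Y                                  # raw co-occurrence counts
    freq = np.diag(C).astype(float)              # per-label frequencies
    return C / np.maximum(freq, 1.0)[:, None]    # row-normalize by frequency

def label_contrastive_loss(Z, Y, tau=0.1):
    """Z: (n, d) unit-norm document embeddings; Y: (n, L) multi-hot labels.
    Documents sharing at least one label are treated as positive pairs
    (a simplified stand-in for the paper's local co-occurrence objective)."""
    sim = Z @ Z.T / tau
    np.fill_diagonal(sim, -1e9)                  # exclude self-similarity
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos = (Y @ Y.T > 0).astype(float)            # label-sharing pairs
    np.fill_diagonal(pos, 0.0)
    per_pair = np.where(pos > 0, -logp, 0.0)     # NLL of each positive pair
    n_pos = np.maximum(pos.sum(axis=1), 1.0)
    return (per_pair.sum(axis=1) / n_pos).mean()

# toy example: 4 documents, 3 labels
Y = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 1, 0],
              [0, 0, 1]])
P = cooccurrence_matrix(Y)   # e.g. P[0, 1] = 2/3: labels 0 and 1 co-occur in 2 of 3 docs with label 0

Z = np.array([[1.0, 0.0],    # toy unit-norm document embeddings
              [0.96, 0.28],
              [0.0, 1.0],
              [-0.28, 0.96]])
loss = label_contrastive_loss(Z, Y)
```

Because the auxiliary loss depends only on training labels, it can simply be omitted at inference time, matching the abstract's point that co-occurrence modeling should not influence predictions outside training.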
Pages: 8702-8718
Page count: 17