BBSN: Bilateral-Branch Siamese Network for Imbalanced Multi-label Text Classification

Cited by: 0
Authors
Zhao, Jiangjiang [1,2]
Li, Jiyi [2]
Fukumoto, Fumiyo [2]
Affiliations
[1] Hangzhou Dianzi Univ, Hangzhou, Peoples R China
[2] Univ Yamanashi, Kofu, Yamanashi, Japan
Source
NEURAL INFORMATION PROCESSING, ICONIP 2022, PT III | 2023 / Vol. 13625
Keywords
Multi-label Text Classification; Imbalanced Data
DOI
10.1007/978-3-031-30111-7_33
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In multi-label text classification, the number of instances per category is usually extremely imbalanced, and learning good models from such data is challenging. Some existing works tackle the problem through class re-balancing strategies or imbalanced loss objectives, but their performance remains limited on imbalanced data. In this work, we propose a model that combines a Siamese Network and a Bilateral-Branch Network to handle representation learning and classifier learning simultaneously. In the Siamese network component, we propose a category-specific similarity strategy to improve representation learning and adopt a novel dynamic learning mechanism to make the model end-to-end trainable. In the bilateral-branch network, we adopt the cumulative learning strategy to shift the learning focus from universal patterns to the tail categories. Overall, we adopt a multi-task architecture to ensure that both the head categories and the tail categories are adequately trained. Experiments on two benchmark datasets show that our method improves performance on both the entire category set and the tail categories, and achieves competitive performance compared with existing approaches.
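The abstract does not spell out the cumulative learning strategy. As a rough illustration only, the sketch below uses the parabolic-decay schedule from the original Bilateral-Branch Network work in visual recognition, where a weight alpha = 1 - (epoch / max_epochs)^2 moves the training focus from the conventional branch to the re-balanced branch. Whether BBSN uses exactly this schedule is an assumption, and the names `cumulative_alpha` and `mix_branch_losses` are hypothetical.

```python
def cumulative_alpha(epoch: int, max_epochs: int) -> float:
    """Parabolic-decay weight: 1.0 at the start of training, 0.0 at the end.

    Assumed schedule (from the original BBN formulation), not confirmed
    for BBSN: alpha = 1 - (epoch / max_epochs) ** 2.
    """
    if max_epochs <= 0:
        raise ValueError("max_epochs must be positive")
    # Clamp the epoch so alpha always stays within [0, 1].
    ratio = min(max(epoch, 0), max_epochs) / max_epochs
    return 1.0 - ratio ** 2


def mix_branch_losses(loss_conventional: float,
                      loss_rebalanced: float,
                      alpha: float) -> float:
    """Weight the two branch losses by alpha and (1 - alpha)."""
    return alpha * loss_conventional + (1.0 - alpha) * loss_rebalanced


if __name__ == "__main__":
    # Early epochs favour the conventional (head-dominated) branch,
    # late epochs favour the re-balanced (tail-focused) branch.
    for epoch in (0, 5, 10):
        print(epoch, cumulative_alpha(epoch, max_epochs=10))
```

With `max_epochs=10`, the weight is 1.0 at epoch 0, 0.75 at epoch 5, and 0.0 at epoch 10, so the tail-focused branch dominates only late in training.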
Pages: 384-396
Number of pages: 13