Accurate use of label dependency in multi-label text classification through the lens of causality

Cited by: 3
Authors
Fan, Caoyun [1]
Chen, Wenqing [2]
Tian, Jidong [1]
Li, Yitian [1]
He, Hao [1]
Jin, Yaohui [1]
Affiliations
[1] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
[2] Sun Yat Sen Univ, Sch Software Engn, Guangzhou, Peoples R China
Keywords
Multi-label text classification; Label dependency; Correlation shortcut; Counterfactual de-bias
DOI
10.1007/s10489-023-04623-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-Label Text Classification (MLTC) aims to assign the most relevant labels to each given text. Existing methods demonstrate that label dependency can help to improve the model's performance. However, the introduction of label dependency may cause the model to suffer from unwanted prediction bias. In this study, we attribute the bias to the model's misuse of label dependency, i.e., the model tends to utilize the correlation shortcut in label dependency rather than fusing text information and label dependency for prediction. Motivated by causal inference, we propose a CounterFactual Text Classifier (CFTC) to eliminate the correlation bias and make causality-based predictions. Specifically, our CFTC first adopts the predict-then-modify backbone to extract precise label information embedded in label dependency, then blocks the correlation shortcut through the counterfactual de-bias technique with the help of the human causal graph. Experimental results on three datasets demonstrate that our CFTC significantly outperforms the baselines and effectively eliminates the correlation bias in datasets.
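The abstract does not give the CFTC equations, so the following is only a toy sketch of the general idea of counterfactual de-biasing, not the authors' model. It assumes a hypothetical text-only predictor that outputs per-label logits, a hypothetical label-dependency weight matrix `dependency`, and a simple additive "predict-then-modify" step. The counterfactual pass replaces the text with an uninformative input (all text logits set to 0), so whatever score remains comes from label correlation alone; subtracting it, scaled by the assumed parameter `alpha`, removes the text-independent part of the dependency term.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def modify(text_logits, dependency):
    """Predict-then-modify (toy version): turn text logits into preliminary
    label probabilities, then add dependency evidence from the other labels.
    dependency[i][j] is a hypothetical weight for "label i supports label j"."""
    n = len(text_logits)
    prelim = [sigmoid(z) for z in text_logits]  # predict step
    return [
        text_logits[j]
        + sum(dependency[i][j] * prelim[i] for i in range(n) if i != j)
        for j in range(n)
    ]  # modify step


def debias(text_logits, dependency, alpha=1.0):
    """Return (factual, counterfactual, debiased) logits.
    The counterfactual pass feeds an uninformative text (all logits 0),
    so its output reflects label correlation only."""
    factual = modify(text_logits, dependency)
    blank_text = [0.0] * len(text_logits)  # assumption: 0 logit = no text evidence
    counterfactual = modify(blank_text, dependency)
    debiased = [f - alpha * c for f, c in zip(factual, counterfactual)]
    return factual, counterfactual, debiased


if __name__ == "__main__":
    # Two labels: the text supports label 0 and argues against label 1,
    # but the (made-up) dependency matrix says the labels tend to co-occur.
    logits = [2.0, -1.0]
    dep = [[0.0, 2.4], [2.4, 0.0]]
    for name, scores in zip(("factual", "counterfactual", "debiased"),
                            debias(logits, dep)):
        print(name, [round(s, 3) for s in scores])
```

In this toy setup the de-biased logit for label j reduces to the text logit plus the dependency weights applied to how far each preliminary probability departs from the uninformative value 0.5, so label dependency still contributes, but only through what the text actually supports.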
Pages: 21841-21857
Number of pages: 17