Counterfactual Adversarial Learning with Representation Interpolation

Cited by: 0
Authors
Wang, Wei [1]
Wang, Boxin [2]
Shi, Ning [1]
Li, Jinfeng [1]
Zhu, Bingyu [1]
Liu, Xiangyu [1]
Zhang, Rong [1]
Affiliations
[1] Alibaba Grp, Hangzhou, Zhejiang, Peoples R China
[2] Univ Illinois, Champaign, IL USA
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021 | 2021
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning models exhibit a preference for statistical fitting over logical reasoning. When the training data are statistically biased, models may memorize spurious correlations, which severely limits performance, especially in small-data scenarios. In this work, we introduce the Counterfactual Adversarial Training (CAT) framework to tackle the problem from a causality perspective. Specifically, for a given sample, CAT first generates a counterfactual representation through latent-space interpolation in an adversarial manner, and then performs Counterfactual Risk Minimization (CRM) on each original-counterfactual pair to dynamically adjust the sample-wise loss weight, which encourages the model to explore the true causal effect. Extensive experiments demonstrate that CAT achieves substantial performance improvements over SOTA across different downstream tasks, including sentence classification, natural language inference, and question answering.
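The two steps named in the abstract (adversarial latent-space interpolation, then sample-wise loss reweighting) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the linear classifier, the grid search over the interpolation coefficient, and especially `pair_weight`, which is a placeholder rule and not the CRM estimator, since the abstract does not specify it.

```python
import math


def interpolate(h, h_other, lam):
    """Convex combination of two latent vectors: lam * h + (1 - lam) * h_other."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(h, h_other)]


def prob_positive(w, b, h):
    """Toy linear classifier (assumption): sigmoid(w . h + b)."""
    z = sum(wi * hi for wi, hi in zip(w, h)) + b
    return 1.0 / (1.0 + math.exp(-z))


def bce(p, y):
    """Binary cross-entropy for one example, clipped for numerical stability."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def adversarial_lambda(w, b, h, y, h_other, grid):
    """Pick the coefficient in `grid` whose interpolated point has the
    highest loss on the original label y (grid search stands in for a
    gradient-based adversarial step)."""
    def loss_at(lam):
        return bce(prob_positive(w, b, interpolate(h, h_other, lam)), y)
    return max(grid, key=loss_at)


def pair_weight(w, b, h, h_cf):
    """Placeholder weight (hypothetical, NOT the paper's CRM estimator):
    grows with the prediction shift between original and counterfactual."""
    return 1.0 + abs(prob_positive(w, b, h) - prob_positive(w, b, h_cf))


def weighted_loss(w, b, batch, weights):
    """Weight-normalized average of per-sample losses."""
    total = sum(wt * bce(prob_positive(w, b, h), y)
                for (h, y), wt in zip(batch, weights))
    return total / sum(weights)


# Toy data: one positive sample and a partner from the other class.
w, b = [1.0, -1.0], 0.0
h, y = [2.0, 0.0], 1
h_other = [0.0, 2.0]
grid = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

lam = adversarial_lambda(w, b, h, y, h_other, grid)
h_cf = interpolate(h, h_other, lam)      # counterfactual representation
weight = pair_weight(w, b, h, h_cf)      # sample-wise loss weight
loss = weighted_loss(w, b, [(h, y)], [weight])
```

In this toy setup the search selects the smallest coefficient in the grid, because moving further toward the opposite-class partner raises the loss on the original label. A real implementation would likely optimize the coefficient by gradient ascent over encoder representations.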
Pages: 4809-4820
Page count: 12