Improving Few-Shot Named Entity Recognition with Causal Interventions

Cited by: 1
Authors
Yang, Zhen [1, 2]
Liu, Yongbin [1]
Ouyang, Chunping [1]
Zhao, Shu [2]
Zhu, Chi [1]
Affiliations
[1] Univ South China, Sch Comp, Hengyang 421000, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
Source
BIG DATA MINING AND ANALYTICS | 2024 / Vol. 7 / Issue 04
Funding
National Natural Science Foundation of China;
Keywords
Prototypes; Correlation; Overfitting; Animals; Vectors; Named entity recognition; Contrastive learning; Accuracy; Semantics; Few shot learning; few-shot Named Entity Recognition (NER); causal inference; prototypical network;
DOI
10.26599/BDMA.2024.9020052
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Few-shot Named Entity Recognition (NER) systems are designed to identify new categories of entities with a limited number of labeled examples. A major challenge encountered by these systems is overfitting, particularly pronounced in comparison to tasks with ample samples. This overfitting predominantly stems from spurious correlations, a consequence of biases inherent in the selection of a small sample set. In response to this challenge, we introduce a novel approach in this paper: a causal intervention-based method for few-shot NER. Building upon the foundational structure of prototypical networks, our method strategically intervenes in the context to obstruct the indirect association between the context and the label. For scenarios restricted to 1-shot, where contextual intervention is not feasible, our method utilizes incremental learning to intervene at the prototype level. This not only counters overfitting but also aids in alleviating catastrophic forgetting. Additionally, to preliminarily classify entity types, we employ entity detection methods for coarse categorization. Considering the distinct characteristics of the source and target domains in few-shot tasks, we introduce sample reweighting to aid in model transfer and generalization. Through rigorous testing across multiple benchmark datasets, our approach consistently sets new state-of-the-art benchmarks, underscoring its efficacy in few-shot NER applications.
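The abstract states that the method builds on prototypical networks. As background only, the following is a minimal sketch of the generic prototypical-network classification step: each class prototype is the mean of that class's support embeddings, and a query token receives the label of the nearest prototype. It is not the authors' implementation and does not include their causal intervention, incremental learning, entity detection, or sample reweighting. The function names, the Euclidean distance, and the toy 2-D embeddings are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def build_prototypes(support):
    """Average the embeddings of each label's support tokens (hypothetical helper)."""
    grouped = defaultdict(list)
    for embedding, label in support:
        grouped[label].append(embedding)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(embedding, prototypes):
    """Return the label whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: dist(embedding, prototypes[label]))


# Toy 2-D embeddings, invented for illustration; a real system would use
# contextual encoder outputs instead.
support = [
    ([1.0, 0.0], "PER"),
    ([0.8, 0.2], "PER"),
    ([0.0, 1.0], "LOC"),
    ([0.2, 0.8], "LOC"),
]
prototypes = build_prototypes(support)
prediction = classify([0.9, 0.1], prototypes)
```

With only a handful of support examples per class, each prototype is the mean of very few vectors, which is one way to picture why small-sample selection bias can distort the decision boundary.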
Pages: 1421 / 1421
Page count: 1