Logit Adjustment with Normalization and Augmentation in Few-Shot Named Entity Recognition

Cited by: 0
Authors
Zhang, Jinglei [1 ,2 ]
Wen, Guochang [1 ,2 ]
Liao, NingLin [4 ]
Du, DongDong [3 ]
Gao, Qing [1 ]
Zhang, Minghui [5 ]
Cao, XiXin [2 ]
Affiliations
[1] Peking Univ, Natl Engn Res Ctr Software Engn, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] China Acad Ind Internet, Beijing, Peoples R China
[4] Beijing Inst Control & Elect Technol, Beijing, Peoples R China
[5] Peking Univ, Handan Inst Innovat, Handan, Hebei, Peoples R China
Source
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT III, KSEM 2024 | 2024, Vol. 14886
Keywords
Natural Language Processing; Information Extraction; Named Entity Recognition; Logit Adjustment; Representation Augmentation
DOI
10.1007/978-981-97-5498-4_31
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We study the problem of few-shot learning in Named Entity Recognition (FS-NER). Unlike other sequence labeling-based models, which mainly focus on learning better representations, we leverage logit adjustment to alleviate the distribution mismatch between the training and test datasets. Furthermore, we propose a simple but effective method for FS-NER, called Logit Adjustment with Normalization and Augmentation (LANA). In detail, LANA first combines a moving average with logit adjustment to retain pre-training information and overcome the representation-drop problem in FS-NER. We also introduce logit normalization to address overfitting in FS-NER and further improve the generalization ability of LANA. Our method achieves competitive performance on seven widely used FS-NER datasets and significantly reduces the influence of overfitting and representation drop.
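As a rough illustration of the two ingredients the abstract refers to, the sketch below combines prior-based logit adjustment with L2 logit normalization in a single loss. It is not the paper's implementation; the function name, the class_priors estimate, and the tau and temperature hyperparameters are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def lana_style_loss(logits, labels, class_priors, tau=1.0, temperature=0.04):
    """Hypothetical sketch: logit normalization followed by logit adjustment.

    logits:       (batch, num_labels) raw token scores
    labels:       (batch,) gold label ids
    class_priors: (num_labels,) label frequencies, e.g. estimated on the support set
    """
    # Logit normalization: divide each logit vector by its L2 norm (scaled by a
    # temperature) to curb over-confident predictions and reduce overfitting.
    norms = logits.norm(p=2, dim=-1, keepdim=True).clamp_min(1e-7)
    normalized = logits / (temperature * norms)

    # Logit adjustment: add scaled log class priors so that rare entity types
    # are not overwhelmed by the frequent "O" (non-entity) label during training.
    adjusted = normalized + tau * torch.log(class_priors + 1e-12)

    return F.cross_entropy(adjusted, labels)
```

The moving-average component mentioned in the abstract, which retains information from pre-training, would sit upstream of this loss and is omitted from the sketch.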
Pages: 398-410
Page count: 13