Contrastive Learning for Morphological Disambiguation Using Large Language Models in Low-Resource Settings

Cited by: 2
Authors
Tolegen, Gulmira [1 ,2 ]
Toleu, Alymzhan [1 ,2 ]
Mussabayev, Rustam [1 ,2 ]
Affiliations
[1] Satbayev Univ, AI Res Lab, Alma Ata 050040, Kazakhstan
[2] Inst Informat & Computat Technol, Lab Anal & Modelling Informat Proc, Alma Ata 050010, Kazakhstan
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, Issue 21
Keywords
morphological disambiguation; large language models; low-resource language; contrastive learning;
DOI
10.3390/app14219992
CLC Number
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
In this paper, a contrastive learning approach for morphological disambiguation (MD) using large language models (LLMs) is presented. A contrastive loss function is introduced for training, which reduces the distance between the correct analysis embedding and the contextual embedding while maintaining a margin between the correct and incorrect embeddings. One aim of the paper is to analyze the effects of fine-tuning an LLM for MD in morphologically complex languages (MCLs), with particular attention to low-resource languages such as Kazakh, as well as Turkish. Another goal is to evaluate various distance measures for this contrastive loss function, aiming to achieve better disambiguation results by computing the distance between the context and analysis embeddings. Existing approaches to morphological disambiguation, such as HMM-based and feature-engineering methods, have limitations in modeling long-term dependencies and in handling large, sparse tagsets. The proposed approach mitigates these challenges by leveraging LLMs, achieving better accuracy on ambiguous and out-of-vocabulary (OOV) tokens without relying on additional hand-crafted features. Experiments were conducted on three datasets for two MCLs, Kazakh and Turkish; the former is a typical low-resource language. The results show that the proposed approach with contrastive loss improves MD performance when integrated with knowledge from large language models.
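The margin-based objective described in the abstract (pulling the correct morphological analysis embedding toward the context embedding while pushing incorrect candidates at least a margin farther away) can be sketched as a small triplet-style loss. The function below is an illustrative reconstruction, not the authors' exact formulation: the Euclidean distance, the `margin` value, and the embedding shapes are all assumptions.

```python
import numpy as np

def contrastive_md_loss(context, correct, incorrect, margin=1.0):
    """Sketch of a margin-based contrastive loss for morphological
    disambiguation: minimize the distance between the context embedding
    and the correct analysis embedding, and penalize any incorrect
    analysis that lies within `margin` of that distance.

    context:   (d,)   contextual embedding of the ambiguous token
    correct:   (d,)   embedding of the gold morphological analysis
    incorrect: (k, d) embeddings of the k incorrect candidate analyses
    """
    d_pos = np.linalg.norm(context - correct)             # distance to the correct analysis
    d_negs = np.linalg.norm(context - incorrect, axis=1)  # distances to incorrect candidates
    # hinge term: only negatives closer than d_pos + margin contribute
    return d_pos + np.maximum(0.0, margin + d_pos - d_negs).sum()
```

At prediction time, disambiguation under this formulation would simply select the candidate analysis whose embedding is closest to the context embedding, e.g. `np.argmin(np.linalg.norm(context - candidates, axis=1))`.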
Pages: 17