Toward Low-Resource Languages Machine Translation: A Language-Specific Fine-Tuning With LoRA for Specialized Large Language Models

Cited by: 0
Authors
Liang, Xiao [1 ,2 ]
Khaw, Yen-Min Jasmina [1 ]
Liew, Soung-Yue [3 ]
Tan, Tien-Ping [4 ]
Qin, Donghong [2 ]
Affiliations
[1] Univ Tunku Abdul Rahman, Fac Informat & Commun Technol, Dept Comp Sci, Kampar 31900, Malaysia
[2] Guangxi Minzu Univ, Sch Artificial Intelligence, Nanning 530008, Peoples R China
[3] Univ Tunku Abdul Rahman, Fac Informat & Commun Technol, Dept Comp & Commun Technol, Kampar 31900, Malaysia
[4] Univ Sains Malaysia, Sch Comp Sci, George Town 11700, Malaysia
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Machine translation; low-resource languages; large language models; parameter-efficient fine-tuning; LoRA
DOI
10.1109/ACCESS.2025.3549795
Chinese Library Classification (CLC)
TP [Automation technology, computer technology];
Discipline Code
0812;
Abstract
In the field of computational linguistics, addressing machine translation (MT) challenges for low-resource languages remains crucial, as these languages often lack the extensive data available to high-resource languages. General large language models (LLMs), such as GPT-4 and Llama, which are primarily trained on monolingual corpora, face significant challenges in translating low-resource languages and often produce subpar translations. This study introduces Language-Specific Fine-Tuning with Low-Rank Adaptation (LSFTL), a method that enhances translation for low-resource languages by adapting the multi-head attention and feed-forward networks of Transformer layers through low-rank matrix adaptation. LSFTL preserves the majority of the model parameters while selectively fine-tuning key components, thereby maintaining stability and improving translation quality. Experiments on non-English-centric low-resource Asian languages demonstrated that LSFTL improved COMET scores by 1-3 points over specialized multilingual machine translation models. In addition, LSFTL's parameter-efficient approach allows smaller models to achieve performance comparable to their larger counterparts, making machine translation systems more accessible and effective for low-resource languages.
Pages: 46616-46626
Number of pages: 11
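
The abstract above describes LSFTL as attaching low-rank adapters to the multi-head attention and feed-forward sub-layers of the Transformer while freezing the remaining parameters. Since the full paper is not reproduced in this record, the following is only a minimal sketch of that general idea using the Hugging Face transformers and peft libraries; the backbone model name, target module names, and hyperparameters (rank, alpha, dropout) are illustrative assumptions, not the authors' reported configuration.

```python
# Minimal LoRA fine-tuning sketch (assumed setup, not the paper's exact code).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model_name = "meta-llama/Llama-2-7b-hf"  # placeholder backbone, assumption
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# Low-rank adapters on the attention projections and the feed-forward network
# of each Transformer layer; the frozen base weights stay untouched, so only
# the small adapter matrices are trained.
lora_config = LoraConfig(
    r=16,             # rank of the low-rank update matrices (illustrative)
    lora_alpha=32,    # scaling factor applied to the adapter output
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # multi-head attention
        "gate_proj", "up_proj", "down_proj",      # feed-forward network
    ],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

After attaching the adapters, the model would be trained on a language-specific parallel corpus formatted as translation prompts, updating only the adapter weights; this is what makes the approach parameter-efficient for low-resource settings.
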
Related Papers
50 records in total
  • [1] adaptMLLM: Fine-Tuning Multilingual Language Models on Low-Resource Languages with Integrated LLM Playgrounds
    Lankford, Seamus
    Afli, Haithem
    Way, Andy
    INFORMATION, 2023, 14 (12)
  • [2] Efficient Fine-Tuning for Low-Resource Tibetan Pre-trained Language Models
    Zhou, Mingjun
    Daiqing, Zhuoma
    Qun, Nuo
    Nyima, Tashi
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT VII, 2024, 15022 : 410 - 422
  • [3] Improving Machine Translation Capabilities by Fine-Tuning Large Language Models and Prompt Engineering with Domain-Specific Data
    Laki, Laszlo Janos
    Yang, Zijian Gyozo
    2024 IEEE 3RD CONFERENCE ON INFORMATION TECHNOLOGY AND DATA SCIENCE, CITDS 2024, 2024, : 129 - 133
  • [4] Repeatability of Fine-Tuning Large Language Models Illustrated Using QLoRA
    Alahmari, Saeed S.
    Hall, Lawrence O.
    Mouton, Peter R.
    Goldgof, Dmitry B.
    IEEE ACCESS, 2024, 12 : 153221 - 153231
  • [5] Getting it right: the limits of fine-tuning large language models
    Browning, Jacob
    ETHICS AND INFORMATION TECHNOLOGY, 2024, 26 (02)
  • [6] Enhancing Chinese comprehension and reasoning for large language models: an efficient LoRA fine-tuning and tree of thoughts framework
    Chen, Songlin
    Wang, Weicheng
    Chen, Xiaoliang
    Zhang, Maolin
    Lu, Peng
    Li, Xianyong
    Du, Yajun
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (01)
  • [7] Fine-Tuning Large Language Models for Private Document Retrieval: A Tutorial
    Sommers, Frank
    Kongthon, Alisa
    Kongyoung, Sarawoot
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 1319 - 1320
  • [8] The Task of Post-Editing Machine Translation for the Low-Resource Language
    Rakhimova, Diana
    Karibayeva, Aidana
    Turarbek, Assem
    APPLIED SCIENCES-BASEL, 2024, 14 (02)
  • [9] A Comparative Analysis of Instruction Fine-Tuning Large Language Models for Financial Text Classification
    Fatemi, Sorouralsadat
    Hu, Yuheng
    Mousavi, Maryam
    ACM TRANSACTIONS ON MANAGEMENT INFORMATION SYSTEMS, 2025, 16 (01)
  • [10] Characterizing Communication in Distributed Parameter-Efficient Fine-Tuning for Large Language Models
    Alnaasan, Nawras
    Huang, Horng-Ruey
    Shafi, Aamir
    Subramoni, Hari
    Panda, Dhabaleswar K.
    2024 IEEE SYMPOSIUM ON HIGH-PERFORMANCE INTERCONNECTS, HOTI 2024, 2024, : 11 - 19