Eliciting analogical reasoning from language models in retrieval-augmented translation under low-resource scenarios

Cited: 0
Authors
Wang, Liyan [1 ]
Wloka, Bartholomaus [2 ]
Lepage, Yves [1 ]
Affiliations
[1] Waseda Univ, 2-7 Hibikino, Fukuoka 808-0135, Japan
[2] Univ Vienna, Porzellangasse 4, A-1090 Vienna, Austria
Funding
Japan Society for the Promotion of Science;
Keywords
Retrieval-augmented neural machine translation; Analogical reasoning; Language models; Multi-objective learning; Low-resource languages;
DOI
10.1016/j.neucom.2025.129680
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Retrieval-Augmented Neural Machine Translation (RANMT), which augments translation models with relevant examples fetched by a similarity retriever, performs well in high-resource settings. However, its advantages are not fully realized in low-resource contexts, where data sparsity often yields less relevant or less useful examples for translation. Recent literature indicates that nearest-neighbor examples drawn from small training sets can even impair RANMT performance. Our examination of 16 low-resource tasks reveals a sharp deterioration in the performance of a multilingual language model when it is conditioned on retrieved examples, compared to direct translation. To address this problem, we explore a framework based on analogical reasoning that aims to enhance the capacity of language models to infer translations from parallel examples in limited-data settings. The framework mimics a cognitive process of human translation by structuring examples as analogy patterns. We propose a multi-objective learning strategy that augments vanilla training for conditional translation so that the model learns latent knowledge from examples. We also investigate different retrieval methods for selecting translation examples based on lexical similarity, semantic relatedness, or a combination of both. The results show that our approach is effective in optimizing RANMT in low-resource settings, delivering notable improvements across all retrieval settings. In particular, augmented training that resembles reasoning with analogies in two directions contributes significantly to deriving benefits from examples, even when their relevance is limited. Moreover, our approach outperforms prompting large language models in few-shot contexts on low-resource translation tasks, and is competitive with models extensively trained on substantial amounts of supervised data.
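The abstract's retrieve-then-structure pipeline can be sketched as follows. This is a minimal illustration, not the paper's implementation: lexical similarity is approximated by token-level Jaccard overlap (a stand-in for a BM25 retriever), semantic relatedness by a character-level `difflib.SequenceMatcher` ratio (a stand-in for sentence-embedding cosine similarity), and the analogy pattern is a simple `source : target` layout; all function names and the toy example pool are hypothetical.

```python
from difflib import SequenceMatcher

def lexical_sim(a: str, b: str) -> float:
    """Token-level Jaccard overlap, a crude proxy for BM25 lexical scoring."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def semantic_sim(a: str, b: str) -> float:
    """Character-level similarity, a placeholder for embedding cosine similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def retrieve(query: str, pool: list[tuple[str, str]], alpha: float = 0.5, k: int = 2):
    """Rank (source, target) pairs by a weighted mix of lexical and semantic scores."""
    score = lambda ex: alpha * lexical_sim(query, ex[0]) + (1 - alpha) * semantic_sim(query, ex[0])
    return sorted(pool, key=score, reverse=True)[:k]

def analogy_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Lay out retrieved pairs as an analogy pattern ending with the open query slot."""
    lines = [f"{src} : {tgt}" for src, tgt in examples]
    lines.append(f"{query} : ")
    return "\n".join(lines)

# Toy bilingual pool (hypothetical data, for illustration only).
pool = [
    ("the cat sleeps", "le chat dort"),
    ("the dog runs", "le chien court"),
    ("a bird sings", "un oiseau chante"),
]
print(analogy_prompt(retrieve("the cat runs", pool), "the cat runs"))
```

Setting `alpha` to 1 or 0 recovers the purely lexical or purely semantic retrieval settings that the abstract compares against the combined one.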
Pages: 18