Enhancing text understanding of decoder-based model by leveraging parameter-efficient fine-tuning method

Cited by: 0
Authors
Feroze, Wasif [1]
Cheng, Shaohuan [1]
Jimale, Elias Lemuye [1]
Jakhro, Abdul Naveed [2]
Qu, Hong [1]
Affiliations
[1] School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu
[2] Department of Information Technology, Shaheed Benazir Bhutto University, Naushahro Feroze Campus, Naushahro Feroze
Keywords
Large language models; MRC question answering; Natural language understanding; Parameter-efficient fine-tuning
DOI
10.1007/s00521-025-10975-3
Abstract
Machine reading comprehension (MRC) is a fundamental natural language understanding task in natural language processing: given a passage, a system must comprehend its text and answer questions about it. MRC is difficult because it requires understanding implicit information, deducing the logical structure of the text, and connecting context across different pieces of information. Most current state-of-the-art approaches to MRC use encoder-based models; although decoder-only language models have achieved unprecedented performance on a range of generative tasks, no earlier research has applied a decoder-only model to MRC question-answering datasets. In this paper, we propose a parameter-efficient fine-tuning framework that effectively increases the MRC capabilities of decoder-only large language models. The framework formulates the MRC task for generative modeling and introduces low-rank adaptation (LoRA) to fine-tune models with many parameters effectively, at lower hardware cost than previous methods. In addition, we integrate a quantized inference strategy for the fine-tuned model to further improve practicality. We conducted experiments on four types of MRC datasets; extensive results show that our model achieves a significant performance boost over baselines and outperforms other strong MRC models. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2025.
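The abstract gives no implementation details, but the pipeline it describes (framing MRC as conditional generation, attaching LoRA adapters to a decoder-only LLM, and quantizing the model to cut memory use) maps onto common open-source tooling. Below is a minimal, illustrative Python sketch using Hugging Face transformers and peft; the base model name, adapter rank, and target modules are assumptions for illustration, not the paper's reported configuration.

```python
# Illustrative sketch only: LoRA fine-tuning setup for MRC on a decoder-only
# LLM, with 4-bit quantization to lower memory requirements. Model name,
# rank, and target modules are assumed, not taken from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "meta-llama/Llama-2-7b-hf"  # hypothetical base model

# Load the base model in 4 bits (bitsandbytes) to reduce GPU memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA: freeze the base weights and train only low-rank adapter matrices.
lora_config = LoraConfig(
    r=8,                                  # adapter rank (assumed)
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights

# MRC framed as conditional generation: passage + question -> answer.
prompt = "Passage: <passage text>\nQuestion: <question text>\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Under this framing, only the small adapter weights need to be trained and stored, and they ship alongside the quantized base model, which is where the hardware-cost and practicality gains described in the abstract come from.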
Pages: 6899-6913 (14 pages)