Fine-Tuning a Large Language Model with Reinforcement Learning for Educational Question Generation

Cited by: 0
Authors
Lamsiyah, Salima [1 ]
El Mahdaouy, Abdelkader [2 ]
Nourbakhsh, Aria [1 ]
Schommer, Christoph [1 ]
Affiliations
[1] Univ Luxembourg, Fac Sci Technol & Med, Dept Comp Sci, Esch Sur Alzette, Luxembourg
[2] Mohammed VI Polytech Univ, Coll Comp, Ben Guerir, Morocco
Source
ARTIFICIAL INTELLIGENCE IN EDUCATION, PT I, AIED 2024 | 2024 / Vol. 14829
Keywords
Educational Question Generation; Large Language Model; Google FLAN-T5; Reinforcement Learning; Self-Critical Sequence Training;
DOI
10.1007/978-3-031-64302-6_30
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Educational question generation (EduQG) aims to automatically generate educational questions from textual content, a capability crucial for the expansion of online education. Prior research in EduQG has predominantly relied on cross-entropy loss for training, which can lead to issues such as exposure bias and inconsistencies between training and testing metrics. To mitigate these issues, we propose a reinforcement learning (RL) based large language model (LLM) for educational question generation. In particular, we fine-tune the Google FLAN-T5 model using a mixed objective function that combines cross-entropy and RL losses to ensure the generation of questions that are syntactically and semantically accurate. The experimental results on the SciQ question generation dataset show that the proposed method is competitive with current state-of-the-art systems in terms of predictive performance and linguistic quality.
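The mixed objective described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the paper's code: the function name `scst_mixed_loss`, the mixing weight `gamma`, and the scalar inputs are all assumptions; the structure follows the general Self-Critical Sequence Training recipe (REINFORCE with the greedy decode's reward as baseline) combined linearly with the cross-entropy loss, as the keywords and abstract indicate.

```python
import numpy as np

def scst_mixed_loss(logp_sampled, reward_sampled, reward_greedy,
                    ce_loss, gamma=0.9):
    """Mixed training objective (illustrative sketch):

        L = gamma * L_RL + (1 - gamma) * L_CE

    L_RL follows Self-Critical Sequence Training: the reward of the
    greedily decoded question serves as a baseline for the reward of
    the sampled question (e.g., a BLEU/ROUGE-style score).

    Args:
        logp_sampled:  per-token log-probabilities of the sampled sequence.
        reward_sampled: sequence-level reward of the sampled question.
        reward_greedy:  sequence-level reward of the greedy-decoded question.
        ce_loss:        standard cross-entropy loss on the reference question.
        gamma:          mixing weight between RL and CE terms (assumed value).
    """
    advantage = reward_sampled - reward_greedy        # self-critical baseline
    rl_loss = -advantage * np.sum(logp_sampled)       # REINFORCE-style term
    return gamma * rl_loss + (1.0 - gamma) * ce_loss

# Example: sampled question scores above the greedy baseline, so the
# RL term reinforces the sampled tokens' log-probabilities.
loss = scst_mixed_loss(np.array([-1.0, -2.0]), 0.8, 0.5, 2.0, gamma=0.9)
```

In practice the log-probabilities, rewards, and cross-entropy loss would come from the fine-tuned FLAN-T5 decoder and an automatic evaluation metric; the sketch only shows how the two loss terms are combined.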
Pages: 424-438
Page count: 15
Related Papers
50 total
  • [11] Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
    Xu, Runxin
    Luo, Fuli
    Zhang, Zhiyuan
    Tan, Chuanqi
    Chang, Baobao
    Huang, Songfang
    Huang, Fei
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9514 - 9528
  • [12] Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model
    Zhang, Hengyuan
    Wu, Yanru
    Li, Dawei
    Yang, Sak
    Zhao, Rui
    Jiang, Yong
    Tan, Fei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 7467 - 7509
  • [13] Scalable Online Planning via Reinforcement Learning Fine-Tuning
    Fickinger, Arnaud
    Hu, Hengyuan
    Amos, Brandon
    Russell, Stuart
    Brown, Noam
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [14] Fine-Tuning a Personalized OpenBioLLM Using Offline Reinforcement Learning
    Shi, Jinsheng
    Yuan, Yuyu
    Wang, Ao
    Nie, Meng
    APPLIED SCIENCES-BASEL, 2025, 15 (05):
  • [15] On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning
    Mandi, Zhao
    Abbeel, Pieter
    James, Stephen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [16] How fine can fine-tuning be? Learning efficient language models
    Radiya-Dixit, Evani
    Wang, Xin
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 2435 - 2442
  • [17] Patent classification by fine-tuning BERT language model
    Lee, Jieh-Sheng
    Hsiang, Jieh
    WORLD PATENT INFORMATION, 2020, 61
  • [18] Knowledge Graph Fusion for Language Model Fine-Tuning
    Bhana, Nimesh
    van Zyl, Terence L.
    2022 9TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE, ISCMI, 2022, : 167 - 172
  • [19] Universal Language Model Fine-tuning for Text Classification
    Howard, Jeremy
    Ruder, Sebastian
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 328 - 339
  • [20] Demystifying Instruction Mixing for Fine-tuning Large Language Models
    Wang, Renxi
    Li, Haonan
    Wu, Minghao
    Wang, Yuxia
    Han, Xudong
    Zhang, Chiyu
    Baldwin, Timothy
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 4: STUDENT RESEARCH WORKSHOP, 2024, : 86 - 93