Fine-Tuning a Large Language Model with Reinforcement Learning for Educational Question Generation

Cited by: 0

Authors:

Lamsiyah, Salima [1]

El Mahdaouy, Abdelkader [2]

Nourbakhsh, Aria [1]

Schommer, Christoph [1]

Affiliations:

[1] Univ Luxembourg, Fac Sci Technol & Med, Dept Comp Sci, Esch Sur Alzette, Luxembourg

[2] Mohammed VI Polytech Univ, Coll Comp, Ben Guerir, Morocco

Source:

ARTIFICIAL INTELLIGENCE IN EDUCATION, PT I, AIED 2024 | 2024 / Volume 14829

Keywords:

Educational Question Generation; Large Language Model; Google FLAN-T5; Reinforcement Learning; Self-Critical Sequence Training;

DOI:

10.1007/978-3-031-64302-6_30

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Educational Question Generation (EduQG) aims to automatically generate educational questions from textual content, which is crucial for the expansion of online education. Prior research in EduQG has predominantly relied on cross-entropy loss for training, which can lead to issues such as exposure bias and inconsistencies between training and testing metrics. To mitigate these issues, we propose a reinforcement learning (RL) based large language model (LLM) for educational question generation. In particular, we fine-tune the Google FLAN-T5 model using a mixed objective function that combines cross-entropy and RL losses to ensure the generation of questions that are syntactically and semantically accurate. The experimental results on the SciQ question generation dataset show that the proposed method is competitive with current state-of-the-art systems in terms of predictive performance and linguistic quality.
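
The abstract does not state the exact form of the mixed objective. As a rough illustration only, the sketch below follows a formulation commonly used with self-critical sequence training: the RL term weights the log-likelihood of a sampled output by how much its reward exceeds that of the greedy output, and the result is blended with the cross-entropy loss. The function name `scst_mixed_loss`, the mixing weight `gamma`, and the scalar rewards are assumptions made for this example and are not taken from the paper.

```python
import math


def scst_mixed_loss(sample_logprobs, ref_logprobs,
                    reward_sample, reward_greedy, gamma=0.9):
    """Hypothetical mixed loss for a single training example.

    sample_logprobs: log-probabilities of the tokens in a sampled question
    ref_logprobs:    log-probabilities the model assigns to the reference tokens
    reward_sample:   scalar reward of the sampled question (treated as a constant)
    reward_greedy:   scalar reward of the greedy question (the self-critical baseline)
    gamma:           assumed mixing weight between the RL and cross-entropy terms
    """
    # Cross-entropy term: mean negative log-likelihood of the reference tokens.
    ce_loss = -sum(ref_logprobs) / len(ref_logprobs)

    # Self-critical term: the greedy output's reward acts as the baseline,
    # so samples that beat greedy decoding have their likelihood increased.
    advantage = reward_sample - reward_greedy
    rl_loss = -advantage * sum(sample_logprobs)

    # Convex combination of the two losses.
    return gamma * rl_loss + (1.0 - gamma) * ce_loss


if __name__ == "__main__":
    loss = scst_mixed_loss(
        sample_logprobs=[math.log(0.5), math.log(0.5)],
        ref_logprobs=[math.log(0.25)],
        reward_sample=0.6,
        reward_greedy=0.4,
        gamma=0.5,
    )
    print(loss)
```

In an actual fine-tuning loop the log-probabilities would be differentiable tensors produced by the model and the rewards would come from an evaluation metric. Plain floats are used here only to keep the arithmetic visible.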

Pages: 424-438

Page count: 15
