How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses

Cited by: 3
Authors
Lin, Jionghao [1 ]
Han, Zifei [1 ]
Thomas, Danielle R. [1 ]
Gurung, Ashish [1 ]
Gupta, Shivang [1 ]
Aleven, Vincent [1 ]
Koedinger, Kenneth R. [1 ]
Affiliation
[1] Carnegie Mellon Univ, Human Comp Interact Inst, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
Funding
Andrew W. Mellon Foundation (USA)
Keywords
Large language models; Feedback; Tutoring training; ChatGPT; GPT-4
DOI
10.1007/s40593-024-00408-y
Chinese Library Classification (CLC) code
TP39 [Computer Applications]
Discipline classification codes
081203; 0835
Abstract
One-on-one tutoring is widely acknowledged as an effective instructional method, provided that tutors are well qualified. However, the high demand for qualified tutors remains a challenge, often necessitating the training of novice tutors (i.e., trainees) to ensure effective tutoring. Research suggests that timely explanatory feedback can facilitate this training, yet delivering such feedback is difficult because having human experts assess trainee performance is time-consuming. Inspired by recent advances in large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. The system classifies trainees' responses as correct or incorrect and automatically provides template-based feedback in which incorrect responses are appropriately rephrased by GPT-4. We conducted our study using the responses of 383 trainees from three training lessons (Giving Effective Praise, Reacting to Errors, and Determining What Students Know). Our findings indicate that: 1) with a few-shot approach, GPT-4 effectively identifies correct and incorrect trainee responses across the three lessons, with an average F1 score of 0.84 and an AUC of 0.85; and 2) with the same few-shot approach, GPT-4 adeptly rephrases incorrect trainee responses into the desired responses, achieving performance comparable to that of human experts.
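To make the abstract's description concrete, the sketch below shows one way such a few-shot pipeline could be wired up with the OpenAI Python client: GPT-4 first labels a trainee response as correct or incorrect, and an incorrect response is then rephrased toward the desired form. This is a minimal illustration under stated assumptions, not the authors' implementation; the prompt wording, the placeholder few-shot examples, the helper names (classify_response, rephrase_incorrect), and the model alias "gpt-4" are all assumptions.

# Illustrative sketch (not the authors' code) of the few-shot pipeline described in the
# abstract: GPT-4 labels a trainee response as correct/incorrect, then rephrases an
# incorrect response toward the desired form. Prompts, examples, and names are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Placeholder few-shot examples; the paper's actual examples and labels are not shown here.
FEW_SHOT_EXAMPLES = [
    ("Give effective praise to a student who solved a hard problem.",
     "Good job, you are so smart!", "incorrect"),
    ("Give effective praise to a student who solved a hard problem.",
     "I can see how much effort you put into breaking the problem into steps.", "correct"),
]

def classify_response(scenario: str, trainee_response: str) -> str:
    """Label a trainee response as 'correct' or 'incorrect' via few-shot prompting."""
    shots = "\n\n".join(
        f"Scenario: {s}\nResponse: {r}\nLabel: {l}" for s, r, l in FEW_SHOT_EXAMPLES
    )
    prompt = (
        "You assess responses from novice tutors in a training lesson. "
        "Label each response as correct or incorrect.\n\n"
        f"{shots}\n\nScenario: {scenario}\nResponse: {trainee_response}\nLabel:"
    )
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return reply.choices[0].message.content.strip().lower()

def rephrase_incorrect(scenario: str, trainee_response: str) -> str:
    """Rephrase an incorrect trainee response into the desired response."""
    prompt = (
        "Rephrase the trainee's response so that it follows research-recommended "
        "tutoring practice for the scenario, preserving the trainee's wording where possible.\n\n"
        f"Scenario: {scenario}\nTrainee response: {trainee_response}\nRephrased response:"
    )
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return reply.choices[0].message.content.strip()

if __name__ == "__main__":
    scenario = "Give effective praise to a student who solved a hard problem."
    response = "You're a genius, well done!"
    if classify_response(scenario, response) == "incorrect":
        print(rephrase_incorrect(scenario, response))

Setting temperature to 0 keeps the labeling step close to deterministic; in practice the rephrasing prompt would also embed the lesson-specific feedback template the abstract mentions.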
Pages: 482-508
Number of pages: 27