Feedback-Generation for Programming Exercises With GPT-4

Cited by: 6
Authors
Azaiz, Imen [1]
Kiesler, Natalie [2]
Strickroth, Sven [1]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Munich, Germany
[2] Nuremberg Tech, Nurnberg, Germany
Source
PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024 | 2024
Keywords
formative feedback; personalized feedback; assessment; introductory programming; Large Language Models; LLMs; GPT-4; Turbo; benchmarking;
DOI
10.1145/3649217.3653594
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies investigated their potential for assisting educators and supporting students in higher education. LLMs such as Codex, GPT-3.5, and GPT-4 have shown promising results in the context of large programming courses, where students can benefit from feedback and hints if provided timely and at scale. This paper explores the quality of GPT-4 Turbo's generated output for prompts containing both the programming task specification and a student's submission as input. Two assignments from an introductory programming course were selected, and GPT-4 was asked to generate feedback for 55 randomly chosen, authentic student programming submissions. The output was qualitatively analyzed regarding correctness, personalization, fault localization, and other features identified in the material. Compared to prior work and analyses of GPT-3.5, GPT-4 Turbo shows notable improvements. For example, the output is more structured and consistent. GPT-4 Turbo can also accurately identify invalid casing in student programs' output. In some cases, the feedback also includes the output of the student program. At the same time, inconsistent feedback was noted such as stating that the submission is correct but an error needs to be fixed. The present work increases our understanding of LLMs' potential, limitations, and how to integrate them into e-assessment systems, pedagogical scenarios, and instructing students who are using applications based on GPT-4.
Pages: 31-37
Page count: 7