Feedback-Generation for Programming Exercises With GPT-4

Cited by: 6

Authors:

Azaiz, Imen [1]

Kiesler, Natalie [2]

Strickroth, Sven [1]

Affiliations:

[1] Ludwig Maximilians Univ Munchen, Munich, Germany

[2] Nuremberg Tech, Nurnberg, Germany

Source:

PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024 | 2024

Keywords:

formative feedback; personalized feedback; assessment; introductory programming; Large Language Models; LLMs; GPT-4; Turbo; benchmarking;

DOI:

10.1145/3649217.3653594

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Ever since Large Language Models (LLMs) and related applications became broadly available, several studies have investigated their potential for assisting educators and supporting students in higher education. LLMs such as Codex, GPT-3.5, and GPT-4 have shown promising results in the context of large programming courses, where students can benefit from feedback and hints if provided in a timely manner and at scale. This paper explores the quality of GPT-4 Turbo's generated output for prompts containing both the programming task specification and a student's submission as input. Two assignments from an introductory programming course were selected, and GPT-4 was asked to generate feedback for 55 randomly chosen, authentic student programming submissions. The output was qualitatively analyzed regarding correctness, personalization, fault localization, and other features identified in the material. Compared to prior work and analyses of GPT-3.5, GPT-4 Turbo shows notable improvements. For example, the output is more structured and consistent. GPT-4 Turbo can also accurately identify invalid casing in student programs' output. In some cases, the feedback also includes the output of the student program. At the same time, inconsistent feedback was noted, such as stating that the submission is correct but that an error needs to be fixed. The present work increases our understanding of LLMs' potential, limitations, and how to integrate them into e-assessment systems, pedagogical scenarios, and instructing students who are using applications based on GPT-4.


Pages: 31-37

Number of pages: 7
