Leveraging Large Language Models (LLMs) for personalized learning and support is a promising direction in computing education. AI Assistants can help students with programming and problem-solving, converse with them to clarify course content, explain error messages to aid debugging, and much more. However, using cloud-based LLMs poses risks around data security and privacy, as well as control of the overarching system. To address these concerns, we created a locally stored Small Language Model (SLM) that leverages different Retrieval-Augmented Generation (RAG) methods to support computing students' learning. We compare one SLM (neural-chat-7b-v3, a fine-tuned version of Mistral-7B-v0.1) against two popular LLMs (gpt-3.5-turbo and gpt-4-32k) to assess its viability for computing educators to use in their course(s). We use conversations from a CS1 course (N = 1,260) in which students were provided with an AI Assistant (using gpt-3.5-turbo) to help them learn content and support problem-solving while completing their Python programming assignment. In total, 269 students used the AI Assistant, asking 1,988 questions. Using this real conversational data, we re-ran the student questions through our SLM (neural-chat-7b-v3, testing nine different RAG methods) and gpt-4-32k, then compared those responses against the original gpt-3.5-turbo responses. Our findings indicate that an SLM with RAG can perform comparably to, if not better than, LLMs. This shows that computing educators can use SLMs (with RAG) in their course(s) as a tool for scalable learning, supporting content understanding and problem-solving needs, while enforcing their own policies on data privacy and security.
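
To make the overall approach concrete, below is a minimal sketch of a local RAG tutoring loop: embed course material, retrieve the snippets most similar to a student's question, and prompt a locally hosted model with that context. The embedding model, document snippets, prompt template, retrieval depth, and the Hugging Face model id are illustrative assumptions, not the exact configuration used in the study.

```python
# Minimal sketch of a local RAG pipeline for a course AI Assistant.
# Assumptions: sentence-transformers and transformers are installed; the
# documents, prompt wording, and model ids are placeholders for illustration.
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# Course content the assistant may draw on (placeholder snippets).
documents = [
    "A Python for-loop repeats a block of code once per item in an iterable.",
    "IndexError means you accessed a list position that does not exist.",
    "Functions are defined with `def` and return values with `return`.",
]

# 1. Embed the documents once with a small sentence-embedding model.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the student's question."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q_vec  # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

# 2. Load the locally stored SLM (assumed Hugging Face id; a 7B model
#    needs substantial memory, so a smaller model could stand in for testing).
generator = pipeline("text-generation", model="Intel/neural-chat-7b-v3")

def answer(question: str) -> str:
    """Build a context-augmented prompt and generate a tutoring response."""
    context = "\n".join(retrieve(question))
    prompt = (
        "You are a CS1 teaching assistant. Use the course notes below to help "
        "the student without giving away full solutions.\n\n"
        f"Course notes:\n{context}\n\nStudent question: {question}\nAnswer:"
    )
    out = generator(prompt, max_new_tokens=200, do_sample=False)
    # The pipeline returns the prompt plus the generation; strip the prompt.
    return out[0]["generated_text"][len(prompt):].strip()

print(answer("Why do I get an IndexError in my loop?"))
```

Because both the embedder and the generator run locally in this sketch, no student question or course material leaves the institution's machines, which is the property that motivates the SLM-with-RAG setup.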