Integrating Small Language Models with Retrieval-Augmented Generation in Computing Education: Key Takeaways, Setup, and Practical Insights

Cited: 0
Authors
Yu, Zezhu [1 ]
Liu, Suqing [2 ]
Denny, Paul [3 ]
Bergen, Andreas [1 ]
Liut, Michael [1 ]
Affiliations
[1] Univ Toronto Mississauga, Mississauga, ON, Canada
[2] McMaster Univ, Hamilton, ON, Canada
[3] Univ Auckland, Auckland, New Zealand
Source
PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 1 | 2025
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Small Language Models; Large Language Models; Retrieval-Augmented Generation; Milvus; Intelligence Concentration; Conversational Agent; Personalized AI Agent; Intelligent Tutoring System; Computing Education; Computer Science Education;
DOI
Not available
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Discipline Codes
081203; 0835;
Abstract
Leveraging a Large Language Model (LLM) for personalized learning in computing education is promising, yet cloud-based LLMs pose risks around data security and privacy. To address these concerns, we developed and deployed a locally stored Small Language Model (SLM) utilizing Retrieval-Augmented Generation (RAG) methods to support computing students' learning. Previous work has demonstrated that SLMs can match or surpass popular LLMs (gpt-3.5-turbo and gpt-4-32k) in handling conversational data from a CS1 course. We deployed SLMs with RAG (SLM + RAG) in a large course with more than 250 active students, fielding nearly 2,000 student questions, while evaluating data privacy, scalability, and the feasibility of local deployment. This paper provides a comprehensive guide to deploying SLM + RAG systems, detailing model selection, vector database choice, embedding methods, and pipeline frameworks. We share practical insights from our deployment, including scalability concerns, accuracy versus context-length trade-offs, guardrails and hallucination reduction, and maintaining data privacy. We address the "Impossible Triangle" in RAG systems: achieving high accuracy, short context length, and low time consumption simultaneously is not feasible. Furthermore, our novel RAG framework, Intelligence Concentration (IC), categorizes information into multiple layers of abstraction within Milvus collections, mitigating these trade-offs and enabling educational assistants to quickly deliver more relevant and personalized responses to students.
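To make the layered-collection idea concrete, here is a minimal sketch of what IC-style retrieval over Milvus could look like in Python, assuming a local Milvus instance queried through pymilvus's MilvusClient and a sentence-transformers embedder; the collection names (course_summaries, topic_notes, qa_pairs), the layer ordering, and the embedding model are illustrative assumptions, not the paper's actual schema or configuration.

```python
# Sketch of layered ("Intelligence Concentration"-style) retrieval over
# Milvus collections. Collection names, layer ordering, and the embedding
# model are hypothetical placeholders, not the paper's configuration.
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")   # 384-dim sentence embeddings
client = MilvusClient(uri="http://localhost:19530")  # locally hosted Milvus

# Hypothetical abstraction layers, ordered from most general to most specific.
LAYERS = ["course_summaries", "topic_notes", "qa_pairs"]

def layered_retrieve(question: str, per_layer: int = 2) -> list[str]:
    """Pull a few top hits from each abstraction layer into one short context."""
    vec = embedder.encode(question).tolist()
    context: list[str] = []
    for name in LAYERS:
        hits = client.search(
            collection_name=name,
            data=[vec],                  # one query vector
            limit=per_layer,             # keep each layer's contribution small
            output_fields=["text"],      # return the stored chunk text
        )
        # hits[0] holds the results for the single query vector above
        context.extend(hit["entity"]["text"] for hit in hits[0])
    return context

# The pooled snippets would then be prepended to the student's question
# and passed to the locally deployed SLM for answer generation.
```

Retrieving a couple of hits per layer, rather than many from one flat collection, is one way to keep the assembled context short without losing either course-level or question-level detail, which is the trade-off the abstract frames as the "Impossible Triangle."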
Pages: 1302 - 1308
Page count: 7
Related Papers
12 entries in total
  • [1] Integrating Small Language Models with Retrieval-Augmented Generation in Computing Education: Key Takeaways, Setup, and Practical Insights
    Yu, Zezhu
    Liu, Suqing
    Denny, Paul
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 2, 2025: 1302 - 1308
  • [2] Can Small Language Models With Retrieval-Augmented Generation Replace Large Language Models When Learning Computer Science?
    Liu, Suqing
    Yu, Zezhu
    Huang, Feiran
    Bulbulia, Yousef
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024, 2024: 388 - 393
  • [3] Enhancement of the Performance of Large Language Models in Diabetes Education through Retrieval-Augmented Generation: Comparative Study
    Wang, Dingqiao
    Liang, Jiangbo
    Ye, Jinguo
    Li, Jingni
    Li, Jingpeng
    Zhang, Qikai
    Hu, Qiuling
    Pan, Caineng
    Wang, Dongliang
    Liu, Zhong
    Shi, Wen
    Shi, Danli
    Li, Fei
    Qu, Bo
    Zheng, Yingfeng
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2024, 26
  • [4] Quantitative Evaluation of Using Large Language Models and Retrieval-Augmented Generation in Computer Science Education
    Wang, Kevin Shukang
    Lawrence, Ramon
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 2, 2025: 1183 - 1189
  • [5] Quantitative Evaluation of Using Large Language Models and Retrieval-Augmented Generation in Computer Science Education
    Wang, Kevin Shukang
    Lawrence, Ramon
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 1, 2025: 1183 - 1189
  • [6] Adaptive Control of Retrieval-Augmented Generation for Large Language Models Through Reflective Tags
    Yao, Chengyuan
    Fujita, Satoshi
    ELECTRONICS, 2024, 13 (23)
  • [7] Facilitating university admission using a chatbot based on large language models with retrieval-augmented generation
    Chen, Zheng
    Zou, Di
    Xie, Haoran
    Lou, Huajie
    Pang, Zhiyuan
    EDUCATIONAL TECHNOLOGY & SOCIETY, 2024, 27 (04): 454 - 470
  • [8] Optimizing Microservice Deployment in Edge Computing with Large Language Models: Integrating Retrieval Augmented Generation and Chain of Thought Techniques
    Feng, Kan
    Luo, Lijun
    Xia, Yongjun
    Luo, Bin
    He, Xingfeng
    Li, Kaihong
    Zha, Zhiyong
    Xu, Bo
    Peng, Kai
    SYMMETRY-BASEL, 2024, 16 (11)
  • [9] Layered Query Retrieval: An Adaptive Framework for Retrieval-Augmented Generation in Complex Question Answering for Large Language Models
    Huang, Jie
    Wang, Mo
    Cui, Yunpeng
    Liu, Juan
    Chen, Li
    Wang, Ting
    Li, Huan
    Wu, Jinming
    APPLIED SCIENCES-BASEL, 2024, 14 (23)
  • [10] Unveiling the Power of Large Language Models: A Comparative Study of Retrieval-Augmented Generation, Fine-Tuning, and Their Synergistic Fusion for Enhanced Performance
    Budakoglu, Gulsum
    Emekci, Hakan
    IEEE ACCESS, 2025, 13: 30936 - 30951