Integrating Small Language Models with Retrieval-Augmented Generation in Computing Education: Key Takeaways, Setup, and Practical Insights

Cited: 0
Authors
Yu, Zezhu [1 ]
Liu, Suqing [2 ]
Denny, Paul [3 ]
Bergen, Andreas [1 ]
Liut, Michael [1 ]
Affiliations
[1] Univ Toronto Mississauga, Mississauga, ON, Canada
[2] McMaster Univ, Hamilton, ON, Canada
[3] Univ Auckland, Auckland, New Zealand
Source
PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 1 | 2025
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Small Language Models; Large Language Models; Retrieval-Augmented Generation; Milvus; Intelligence Concentration; Conversational Agent; Personalized AI Agent; Intelligent Tutoring System; Computing Education; Computer Science Education;
DOI
Not available
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Discipline Codes
081203; 0835;
Abstract
Leveraging a Large Language Model (LLM) for personalized learning in computing education is promising, yet cloud-based LLMs pose risks around data security and privacy. To address these concerns, we developed and deployed a locally stored Small Language Model (SLM) utilizing Retrieval-Augmented Generation (RAG) methods to support computing students' learning. Previous work has demonstrated that SLMs can match or surpass popular LLMs (gpt-3.5-turbo and gpt-4-32k) in handling conversational data from a CS1 course. We deployed SLMs with RAG (SLM + RAG) in a large course with more than 250 active students, fielding nearly 2,000 student questions, while evaluating data privacy, scalability, and the feasibility of local deployment. This paper provides a comprehensive guide to deploying SLM + RAG systems, detailing model selection, vector database choice, embedding methods, and pipeline frameworks. We share practical insights from our deployment, including scalability concerns, accuracy versus context-length trade-offs, guardrails and hallucination reduction, and maintaining data privacy. We address the "Impossible Triangle" in RAG systems: achieving high accuracy, short context length, and low time consumption simultaneously is not feasible. Furthermore, our novel RAG framework, Intelligence Concentration (IC), categorizes information into multiple layers of abstraction within Milvus collections, mitigating these trade-offs and enabling educational assistants to quickly deliver more relevant and personalized responses to students.
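To make the layered-collection idea concrete, here is a minimal sketch of what IC-style retrieval over Milvus could look like in Python, assuming a local Milvus instance queried through pymilvus's MilvusClient and a sentence-transformers embedder; the collection names (course_summaries, topic_notes, qa_pairs), the layer ordering, and the embedding model are illustrative assumptions, not the paper's actual schema or configuration.

```python
# Sketch of layered ("Intelligence Concentration"-style) retrieval over
# Milvus collections. Collection names, layer ordering, and the embedding
# model are hypothetical placeholders, not the paper's configuration.
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")   # 384-dim sentence embeddings
client = MilvusClient(uri="http://localhost:19530")  # locally hosted Milvus

# Hypothetical abstraction layers, ordered from most general to most specific.
LAYERS = ["course_summaries", "topic_notes", "qa_pairs"]

def layered_retrieve(question: str, per_layer: int = 2) -> list[str]:
    """Pull a few top hits from each abstraction layer into one short context."""
    vec = embedder.encode(question).tolist()
    context: list[str] = []
    for name in LAYERS:
        hits = client.search(
            collection_name=name,
            data=[vec],                  # one query vector
            limit=per_layer,             # keep each layer's contribution small
            output_fields=["text"],      # return the stored chunk text
        )
        # hits[0] holds the results for the single query vector above
        context.extend(hit["entity"]["text"] for hit in hits[0])
    return context

# The pooled snippets would then be prepended to the student's question
# and passed to the locally deployed SLM for answer generation.
```

Retrieving a couple of hits per layer, rather than many from one flat collection, is one way to keep the assembled context short without losing either course-level or question-level detail, which is the trade-off the abstract frames as the "Impossible Triangle."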
Pages: 1302 - 1308
Page count: 7
Related Papers
12 entries in total
  • [1] Integrating Small Language Models with Retrieval-Augmented Generation in Computing Education: Key Takeaways, Setup, and Practical Insights
    Yu, Zezhu
    Liu, Suqing
    Denny, Paul
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 2, 2025: 1302 - 1308
  • [2] Can Small Language Models With Retrieval-Augmented Generation Replace Large Language Models When Learning Computer Science?
    Liu, Suqing
    Yu, Zezhu
    Huang, Feiran
    Bulbulia, Yousef
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024, 2024: 388 - 393
  • [3] Enhancement of the Performance of Large Language Models in Diabetes Education through Retrieval-Augmented Generation: Comparative Study
    Wang, Dingqiao
    Liang, Jiangbo
    Ye, Jinguo
    Li, Jingni
    Li, Jingpeng
    Zhang, Qikai
    Hu, Qiuling
    Pan, Caineng
    Wang, Dongliang
    Liu, Zhong
    Shi, Wen
    Shi, Danli
    Li, Fei
    Qu, Bo
    Zheng, Yingfeng
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2024, 26
  • [4] Quantitative Evaluation of Using Large Language Models and Retrieval-Augmented Generation in Computer Science Education
    Wang, Kevin Shukang
    Lawrence, Ramon
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 2, 2025: 1183 - 1189
  • [5] Quantitative Evaluation of Using Large Language Models and Retrieval-Augmented Generation in Computer Science Education
    Wang, Kevin Shukang
    Lawrence, Ramon
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 1, 2025: 1183 - 1189
  • [6] Adaptive Control of Retrieval-Augmented Generation for Large Language Models Through Reflective Tags
    Yao, Chengyuan
    Fujita, Satoshi
    ELECTRONICS, 2024, 13 (23)
  • [7] Facilitating university admission using a chatbot based on large language models with retrieval-augmented generation
    Chen, Zheng
    Zou, Di
    Xie, Haoran
    Lou, Huajie
    Pang, Zhiyuan
    EDUCATIONAL TECHNOLOGY & SOCIETY, 2024, 27 (04): 454 - 470
  • [8] Optimizing Microservice Deployment in Edge Computing with Large Language Models: Integrating Retrieval Augmented Generation and Chain of Thought Techniques
    Feng, Kan
    Luo, Lijun
    Xia, Yongjun
    Luo, Bin
    He, Xingfeng
    Li, Kaihong
    Zha, Zhiyong
    Xu, Bo
    Peng, Kai
    SYMMETRY-BASEL, 2024, 16 (11)
  • [9] Layered Query Retrieval: An Adaptive Framework for Retrieval-Augmented Generation in Complex Question Answering for Large Language Models
    Huang, Jie
    Wang, Mo
    Cui, Yunpeng
    Liu, Juan
    Chen, Li
    Wang, Ting
    Li, Huan
    Wu, Jinming
    APPLIED SCIENCES-BASEL, 2024, 14 (23)
  • [10] Unveiling the Power of Large Language Models: A Comparative Study of Retrieval-Augmented Generation, Fine-Tuning, and Their Synergistic Fusion for Enhanced Performance
    Budakoglu, Gulsum
    Emekci, Hakan
    IEEE ACCESS, 2025, 13: 30936 - 30951