A Survey on Symbolic Knowledge Distillation of Large Language Models

Cited by: 0
Authors
Acharya, Kamal [1]
Velasquez, Alvaro [2]
Song, Houbing Herbert [1]
Affiliations
[1] Security and Optimization for Networked Globe Laboratory (SONG Lab), Department of Information Systems, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
[2] Department of Computer Science, University of Colorado, Boulder, CO 80309, USA
Source
IEEE Transactions on Artificial Intelligence | 2024, Vol. 5, No. 12
Funding
U.S. National Science Foundation
Keywords
Large language models (LLMs); symbolic knowledge; symbolic knowledge distillation
DOI
10.1109/TAI.2024.3428519
Abstract
This survey article delves into the emerging and critical area of symbolic knowledge distillation in large language models (LLMs). As LLMs such as generative pretrained transformer-3 (GPT-3) and bidirectional encoder representations from transformers (BERT) continue to expand in scale and complexity, the challenge of effectively harnessing their extensive knowledge becomes paramount. This survey concentrates on the process of distilling the intricate, often implicit knowledge contained within these models into a more symbolic, explicit form. This transformation is crucial for enhancing the interpretability, efficiency, and applicability of LLMs. We categorize the existing research based on methodologies and applications, focusing on how symbolic knowledge distillation can be used to improve the transparency and functionality of smaller, more efficient artificial intelligence (AI) models. The survey discusses the core challenges, including maintaining the depth of knowledge in a comprehensible format, and explores the various approaches and techniques that have been developed in this field. We identify gaps in current research and potential opportunities for future advancements. This survey aims to provide a comprehensive overview of symbolic knowledge distillation in LLMs, spotlighting its significance in the progression toward more accessible and efficient AI systems. © 2024 IEEE.
Pages: 5928-5948
Page count: 20
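
The abstract describes symbolic knowledge distillation as transferring the implicit knowledge of a large teacher LLM into an explicit, symbolic form that smaller, more interpretable student models can learn from. The following is a minimal, hypothetical Python sketch of the generate-filter-train pipeline common in this line of work (e.g., West et al.'s commonsense distillation cited by the survey); all function names, prompts, and example statements are illustrative placeholders, not the survey's method or any specific library's API.

# Minimal, illustrative sketch of a generate-filter-train symbolic knowledge
# distillation loop. All names and example data are hypothetical placeholders.

def teacher_generate(prompt: str) -> list[str]:
    """Stand-in for prompting a large teacher LLM (e.g., GPT-3) to emit
    symbolic knowledge such as commonsense inference triples."""
    return [
        "PersonX goes to the gym -> xEffect -> PersonX gets stronger",
        "PersonX goes to the gym -> xIntent -> to stay healthy",
        "PersonX goes to the gym -> malformed output",
    ]

def critic_filter(statements: list[str], threshold: float = 0.5) -> list[str]:
    """Stand-in for a learned critic that scores each generated statement
    and keeps only those above a quality threshold."""
    def score(statement: str) -> float:
        # Toy acceptability check; a real critic would be a trained classifier.
        return 1.0 if statement.count("->") == 2 else 0.0
    return [s for s in statements if score(s) >= threshold]

def train_student(corpus: list[str]) -> None:
    """Stand-in for fine-tuning a smaller student model on the distilled,
    explicit symbolic corpus."""
    print(f"Fine-tuning student model on {len(corpus)} distilled statements")

if __name__ == "__main__":
    raw = teacher_generate("List commonsense inferences about everyday events.")
    distilled = critic_filter(raw)
    train_student(distilled)

The point this sketch tries to make concrete is the one the abstract emphasizes: the distilled corpus is explicit and human-readable, which is what allows the resulting student models to be both smaller and more transparent than the teacher.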