Few-shot training LLMs for project-specific code-summarization

Cited by: 78
Authors
Ahmed, Toufique [1 ]
Devanbu, Premkumar [1 ]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
Source
PROCEEDINGS OF THE 37TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE 2022 | 2022
Keywords
deep learning; code summarization; large language model;
DOI
10.1145/3551349.3559555
Chinese Library Classification
TP [Automation technology; computer technology];
Subject classification code
0812 ;
Abstract
Very large language models (LLMs), such as GPT-3 and Codex, have achieved state-of-the-art performance on several natural-language tasks, and also show great promise for code. A particularly exciting aspect of LLMs is their knack for few-shot and zero-shot learning: they can learn to perform a task with very few examples. Few-shotting has particular synergies in software engineering, where many phenomena (identifier names, APIs, terminology, coding patterns) are known to be highly project-specific. However, project-specific data can be quite limited, especially early in the history of a project; thus the few-shot learning capacity of LLMs might be very relevant. In this paper, we investigate the use of few-shot training with the very large GPT (Generative Pretrained Transformer) Codex model, and find evidence suggesting that one can significantly surpass state-of-the-art models for code-summarization, leveraging project-specific training.
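The few-shot setup the abstract describes amounts to prepending a handful of project-specific (code, summary) pairs to the target function and letting the model complete the summary. A minimal sketch of such prompt assembly is below; the example functions, summaries, and delimiter format are invented for illustration and are not taken from the paper's actual prompt design.

```python
# Hypothetical few-shot prompt construction for code summarization.
# Each shot is a code snippet followed by its one-line summary; the
# target code comes last with an empty summary slot for the LLM to fill.

FEW_SHOT_EXAMPLES = [
    ("def add(a, b):\n    return a + b",
     "Adds two numbers and returns the result."),
    ("def is_empty(items):\n    return len(items) == 0",
     "Checks whether the given list is empty."),
]

def build_prompt(target_code, examples=FEW_SHOT_EXAMPLES):
    """Assemble a few-shot prompt: each example pair, then the target
    code with a trailing '# Summary:' cue for the model to complete."""
    parts = []
    for code, summary in examples:
        parts.append(f"# Code:\n{code}\n# Summary: {summary}\n")
    parts.append(f"# Code:\n{target_code}\n# Summary:")
    return "\n".join(parts)

prompt = build_prompt("def square(x):\n    return x * x")
print(prompt)
```

In a project-specific setting, the example pairs would be drawn from the same project as the target function, so the model can pick up local identifier conventions and terminology before generating the summary.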
Pages: 5
References (24 total)
[1]  
Ahmad WU, 2021, 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), P2655
[2]   Multilingual training for Software Engineering [J].
Ahmed, Toufique ;
Devanbu, Premkumar .
2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022), 2022, :1443-1455
[3]  
Bommasani R., 2022, On the opportunities and risks of foundation models, DOI 10.48550/arXiv.2108.07258
[4]  
Brown TB, 2020, ADV NEUR IN, V33
[5]  
Chen M., 2021, arXiv
[6]  
Devlin J, 2019, Arxiv, DOI 10.48550/arXiv.1810.04805
[7]  
Feng ZY, 2020, FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, P1536
[8]  
Fried D, 2023, Arxiv, DOI arXiv:2204.05999
[9]  
Guo D, 2020, INT C LEARN REPR
[10]   Are Deep Neural Networks the Best Choice for Modeling Source Code? [J].
Hellendoorn, Vincent J. ;
Devanbu, Premkumar .
ESEC/FSE 2017: PROCEEDINGS OF THE 2017 11TH JOINT MEETING ON FOUNDATIONS OF SOFTWARE ENGINEERING, 2017, :763-773