Large Language Models and Simple, Stupid Bugs

Cited by: 11
Authors
Jesse, Kevin [1 ]
Ahmed, Toufique [1 ]
Devanbu, Premkumar T. [1 ]
Morgan, Emily [1 ]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
Source
2023 IEEE/ACM 20TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR | 2023
Funding
National Science Foundation (USA);
Keywords
language models; prompting; deep learning; software engineering;
DOI
10.1109/MSR59073.2023.00082
CLC Number
TP31 [Computer Software];
Discipline Codes
081202; 0835;
Abstract
With the advent of powerful neural language models, AI-based systems that assist developers in coding tasks are becoming widely available; Copilot is one such system. Copilot uses Codex, a large language model (LLM), to complete code conditioned on a preceding "prompt". Codex, however, is trained on public GitHub repositories, viz., on code that may include bugs and vulnerabilities. Previous studies [1], [2] show that Codex reproduces vulnerabilities seen in training. In this study, we examine how prone Codex is to generate an interesting bug category: single-statement bugs, commonly referred to in the MSR community as simple, stupid bugs, or SStuBs. We find that Codex and similar LLMs do help avoid some SStuBs, but produce known, verbatim SStuBs as much as twice as often as known, verbatim correct code. We explore the consequences of the Codex-generated SStuBs and propose avoidance strategies, suggesting the possibility of reducing the production of known, verbatim SStuBs and increasing the likelihood of producing known, verbatim fixes.
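For readers unfamiliar with the category, a SStuB is a bug whose entire fix fits within a single statement, often a one-token change. The sketch below is a hypothetical illustration (not drawn from the paper's data) of one such pattern, a wrong binary operator, where the buggy and fixed versions differ by a single character:

```java
public class SStuBExample {
    // Buggy variant (single-statement bug): using `<=` instead of `<`
    // reads one element past the end of the array and throws
    // ArrayIndexOutOfBoundsException.
    //
    // for (int i = 0; i <= a.length; i++) s += a[i];

    // Fixed variant: the one-token change (<= to <) is the entire fix.
    static int sum(int[] a) {
        int s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3})); // prints 6
    }
}
```

Because such fixes are tiny and frequent in version histories, they are both easy for an LLM to memorize verbatim and, as the paper shows, easy to reproduce in their buggy form.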
Pages: 563-575 (13 pages)
Related Papers (50 total)
  • [31] Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning
    Shah, Dhruv
    Equi, Michael
    Osinski, Blazej
    Xia, Fei
    Ichter, Brian
    Levine, Sergey
    CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229
  • [32] Framework for evaluating code generation ability of large language models
    Yeo, Sangyeop
    Ma, Yu-Seung
    Kim, Sang Cheol
    Jun, Hyungkook
    Kim, Taeho
    ETRI JOURNAL, 2024, 46 (01) : 106 - 117
  • [33] Towards an understanding of large language models in software engineering tasks
    Zheng, Zibin
    Ning, Kaiwen
    Zhong, Qingyuan
    Chen, Jiachi
    Chen, Wenqing
    Guo, Lianghong
    Wang, Weicheng
    Wang, Yanlin
    EMPIRICAL SOFTWARE ENGINEERING, 2025, 30 (02)
  • [34] Reimagining Self-Adaptation in the Age of Large Language Models
    Donakanti, Raghav
    Jain, Prakhar
    Kulkarni, Shubham
    Vaidhyanathan, Karthik
    IEEE 21ST INTERNATIONAL CONFERENCE ON SOFTWARE ARCHITECTURE COMPANION, ICSA-C 2024, 2024, : 171 - 174
  • [35] Large Language Models for Software Engineering: A Systematic Mapping Study
    Gormez, Muhammet Kursat
    Yilmaz, Murat
    Clarke, Paul M.
    SYSTEMS, SOFTWARE AND SERVICES PROCESS IMPROVEMENT, EUROSPI 2024, PT I, 2024, 2179 : 64 - 79
  • [36] Clinical and Surgical Applications of Large Language Models: A Systematic Review
    Pressman, Sophia M.
    Borna, Sahar
    Gomez-Cabello, Cesar A.
    Haider, Syed Ali
    Haider, Clifton R.
    Forte, Antonio Jorge
    JOURNAL OF CLINICAL MEDICINE, 2024, 13 (11)
  • [37] Large Language Models for Software Engineering: A Systematic Literature Review
    Hou, Xinyi
    Zhao, Yanjie
    Liu, Yue
    Yang, Zhou
    Wang, Kailong
    Li, Li
    Luo, Xiapu
    Lo, David
    Grundy, John
    Wang, Haoyu
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2024, 33 (08)
  • [38] The ambiguity of BERTology: what do large language models represent?
    Buder-Gröndahl, Tommi
    SYNTHESE, 203
  • [39] Why Large Language Models will (not) Kill Software Engineering Research
    Di Penta, Massimiliano
    PROCEEDINGS OF 2024 28TH INTERNATIONAL CONFERENCE ON EVALUATION AND ASSESSMENT IN SOFTWARE ENGINEERING, EASE 2024, 2024, : 5 - 5
  • [40] Missed Connections: Lateral Thinking Puzzles for Large Language Models
    Todd, Graham
    Merino, Tim
    Earle, Sam
    Togelius, Julian
    2024 IEEE CONFERENCE ON GAMES, COG 2024, 2024