Balancing Security and Correctness in Code Generation: An Empirical Study on Commercial Large Language Models

Cited by: 2
Authors
Black, Gavin S. [1 ]
Rimal, Bhaskar P. [2 ]
Vaidyan, Varghese Mathew [1 ]
Affiliations
[1] Dakota State Univ, Beacom Coll Comp & Cyber Sci, Madison, SD 57042 USA
[2] Univ Idaho, Dept Comp Sci, Moscow, ID 83844 USA
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2025, Vol. 9, No. 1
Keywords
Codes; Security; Testing; Task analysis; Software; Logic; Computational intelligence; Code generation; code security; CWE; large language models; prompt engineering; vulnerability;
DOI
10.1109/TETCI.2024.3446695
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large language models (LLMs) continue to be adopted for a multitude of previously manual tasks, with code generation as a prominent use. Multiple commercial models have seen wide adoption due to the accessible nature of the interface. Simple prompts can lead to working solutions that save developers time. However, the generated code has a significant challenge with maintaining security. There are no guarantees on code safety, and LLM responses can readily include known weaknesses. To address this concern, our research examines different prompt types for shaping responses from code generation tasks to produce safer outputs. The top set of common weaknesses is generated through unconditioned prompts to create vulnerable code across multiple commercial LLMs. These inputs are then paired with different contexts, roles, and identification prompts intended to improve security. Our findings show that the inclusion of appropriate guidance reduces vulnerabilities in generated code, with the choice of model having the most significant effect. Additionally, timings are presented to demonstrate the efficiency of singular requests that limit the number of model interactions.
Pages: 419-430
Page count: 12