Enhanced automated code vulnerability repair using large language models

Cited by: 2
Authors
de-Fitero-Dominguez, David [1 ]
Garcia-Lopez, Eva [1 ]
Garcia-Cabot, Antonio [1 ]
Martinez-Herraiz, Jose-Javier [1 ]
Affiliations
[1] Univ Alcala, Dept Ciencias Computac, Alcala De Henares 28805, Spain
Keywords
Automated code repair; Deep learning; Large language models; Vulnerability repair; Mistral; Code Llama;
DOI
10.1016/j.engappai.2024.109291
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
This research addresses the complex challenge of automated repair of code vulnerabilities, vital for enhancing digital security in an increasingly technology-driven world. The study introduces a novel and efficient format for representing code modifications, using advanced Large Language Models (LLMs) such as Code Llama and Mistral. These models, fine-tuned on datasets featuring C/C++ code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques. A key finding is the enhanced repair accuracy of these models compared to previous methods such as VulRepair, which underscores their practical utility and efficiency. The research also offers a critical assessment of current evaluation metrics, such as "Perfect Predictions", and their limitations in reflecting the true capabilities of automated repair models in real-world scenarios. Following this, it underscores the importance of using test datasets devoid of training samples, emphasizing the need for dataset integrity to enhance the effectiveness of LLMs in code repair tasks. The significance of this work is its contribution to digital security, setting new standards for automated code vulnerability repair and paving the way for future advancements in the fields of cybersecurity and artificial intelligence. The study not only highlights the potential of LLMs in enhancing code security but also fosters further exploration and research in these crucial areas.
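Two technical ideas in the abstract can be made concrete with a minimal sketch (this is not the paper's code; all function and variable names here are illustrative assumptions): the "Perfect Predictions" metric, under which a repair counts only if the generated patch matches the ground-truth fix exactly after whitespace normalization, and dataset integrity, i.e. dropping test samples whose inputs also occur in the training set so that scores are not inflated by memorization.

```python
def normalize(code: str) -> str:
    """Collapse whitespace so formatting differences do not count as mismatches."""
    return " ".join(code.split())

def perfect_prediction_rate(predictions, references) -> float:
    """Fraction of generated patches identical to the ground-truth fix."""
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references) if references else 0.0

def deduplicate_test_set(train_inputs, test_pairs):
    """Keep only test (vulnerable_code, fix) pairs whose input never appears in training."""
    seen = {normalize(x) for x in train_inputs}
    return [(x, y) for x, y in test_pairs if normalize(x) not in seen]
```

Under this metric a semantically correct but differently worded patch still scores zero, which is exactly the limitation of "Perfect Predictions" the abstract critiques.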
Pages: 13