Enhanced automated code vulnerability repair using large language models

Cited by: 2

Authors:

de-Fitero-Dominguez, David [1]

Garcia-Lopez, Eva [1]

Garcia-Cabot, Antonio [1]

Martinez-Herraiz, Jose-Javier [1]

Affiliations:

[1] Univ Alcala, Dept Ciencias Computac, Alcala De Henares 28805, Spain

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2024 / Volume 138

Keywords:

Automated code repair; Deep learning; Large language models; Vulnerability repair; Mistral; Code Llama

DOI:

10.1016/j.engappai.2024.109291

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This research addresses the complex challenge of automated repair of code vulnerabilities, vital for enhancing digital security in an increasingly technology-driven world. The study introduces a novel and efficient format for the representation of code modification, using advanced Large Language Models (LLMs) such as Code Llama and Mistral. These models, fine-tuned on datasets featuring C/C++ code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques. A key finding is the enhanced repair accuracy of these models when compared to previous methods such as VulRepair, which underscores their practical utility and efficiency. The research also offers a critical assessment of current evaluation metrics, such as "Perfect Predictions", and their limitations in reflecting the true capabilities of automated repair models in real-world scenarios. Following this, it underscores the importance of using test datasets devoid of training samples, emphasizing the need for dataset integrity to enhance the effectiveness of LLMs in code repair tasks. The significance of this work is its contribution to digital security, setting new standards for automated code vulnerability repair and paving the way for future advancements in the fields of cybersecurity and artificial intelligence. The study not only highlights the potential of LLMs in enhancing code security but also fosters further exploration and research in these crucial areas.


Pages: 13

50 items in total

[1] Impact of Code Language Models on Automated Program Repair
Jiang, Nan
Liu, Kevin
Lutellier, Thibaud
Tan, Lin
2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE, 2023, : 1430 - 1442
[2] Evaluating Large Language Models for Real-World Vulnerability Repair in C/C++ Code
Zhang, Lan
Zou, Qingtian
Singhal, Anoop
Sun, Xiaoyan
Liu, Peng
PROCEEDINGS OF THE 10TH ACM INTERNATIONAL WORKSHOP ON SECURITY AND PRIVACY ANALYTICS, IWSPA 2024, 2024, : 49 - 58
[3] DrPlanner: Diagnosis and Repair of Motion Planners for Automated Vehicles Using Large Language Models
Lin, Yuanfei
Li, Chenran
Ding, Mingyu
Tomizuka, Masayoshi
Zhan, Wei
Althoff, Matthias
IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (10): : 8218 - 8225
[4] Automated Program Repair Using Generative Models for Code Infilling
Koutcheme, Charles
Sarsa, Sami
Leinonen, Juho
Hellas, Arto
Denny, Paul
ARTIFICIAL INTELLIGENCE IN EDUCATION, AIED 2023, 2023, 13916 : 798 - 803
[5] Evaluating Large Language Models for Automated CPT Code Prediction in Endovascular Neurosurgery
Roy, Joanna M.
Self, D. Mitchell
Isch, Emily
Musmar, Basel
Lan, Matthews
Keppetipola, Kavantissa
Koduri, Sravanthi
Pontarelli, Mary-Katharine
Tjoumakaris, Stavropoula I.
Gooch, M. Reid
Rosenwasser, Robert H.
Jabbour, Pascal M.
JOURNAL OF MEDICAL SYSTEMS, 2025, 49 (01)
[6] KARGEN: Knowledge-Enhanced Automated Radiology Report Generation Using Large Language Models
Li, Yingshu
Wang, Zhanyu
Liu, Yunyi
Wang, Lei
Liu, Lingqiao
Zhou, Luping
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT V, 2024, 15005 : 382 - 392
[7] Investigating large language models capabilities for automatic code repair in Python
Omari, Safwan
Basnet, Kshitiz
Wardat, Mohammad
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (08): : 10717 - 10731
[8] FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair
Fatima, Sakina
Hemmati, Hadi
Briand, Lionel C.
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2024, 50 (12) : 3146 - 3171
[9] Code Detection for Hardware Acceleration Using Large Language Models
Martinez, Pablo Antonio
Bernabe, Gregorio
Garcia, Jose Manuel
IEEE ACCESS, 2024, 12 : 35271 - 35281
[10] Evaluating Impact of Conventional Code Analysis Against Large Language Models in API Vulnerability Detection
Yildirim, Recep
Aydin, Kerem
Cetin, Orcun
PROCEEDINGS OF THE 2024 EUROPEAN INTERDISCIPLINARY CYBERSECURITY CONFERENCE, EICC 2024, 2024, : 57 - 64
