Enhanced automated code vulnerability repair using large language models

Cited by: 2
Authors
de-Fitero-Dominguez, David [1]
Garcia-Lopez, Eva [1]
Garcia-Cabot, Antonio [1]
Martinez-Herraiz, Jose-Javier [1]
Affiliations
[1] Univ Alcala, Dept Ciencias Computac, Alcala De Henares 28805, Spain
Keywords
Automated code repair; Deep learning; Large language models; Vulnerability repair; Mistral; Code Llama
DOI
10.1016/j.engappai.2024.109291
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This research addresses the complex challenge of automated repair of code vulnerabilities, which is vital for enhancing digital security in an increasingly technology-driven world. The study introduces a novel and efficient format for representing code modifications, using advanced Large Language Models (LLMs) such as Code Llama and Mistral. These models, fine-tuned on datasets of C/C++ code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques. A key finding is that these models achieve higher repair accuracy than previous methods such as VulRepair, which underscores their practical utility and efficiency. The research also offers a critical assessment of current evaluation metrics, such as "Perfect Predictions", and their limitations in reflecting the true capabilities of automated repair models in real-world scenarios. It then underscores the importance of using test datasets that contain no training samples, emphasizing the need for dataset integrity to enhance the effectiveness of LLMs in code repair tasks. The significance of this work lies in its contribution to digital security, setting new standards for automated code vulnerability repair and paving the way for future advances in cybersecurity and artificial intelligence. The study not only highlights the potential of LLMs for enhancing code security but also encourages further exploration and research in these crucial areas.
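The abstract's two evaluation points (an exact-match style "Perfect Predictions" metric, and test sets that contain no training samples) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not define the metric or the deduplication procedure, so the function names, the whitespace normalization, and the (vulnerable code, fix) pair layout are hypothetical and may differ from the paper's actual method.

```python
def normalize(code: str) -> str:
    """Collapse all whitespace so formatting differences do not affect comparison.

    Assumption: whitespace-insensitive matching; the paper may compare differently.
    """
    return " ".join(code.split())


def remove_train_overlap(train, test):
    """Drop test pairs whose vulnerable input also appears in the training set.

    Both arguments are lists of (vulnerable_code, fixed_code) pairs (hypothetical layout).
    """
    seen = {normalize(src) for src, _ in train}
    return [(src, fix) for src, fix in test if normalize(src) not in seen]


def perfect_prediction_rate(predictions, references):
    """Fraction of predictions identical to the reference fix after normalization."""
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


if __name__ == "__main__":
    train = [("strcpy(buf, s);", "strncpy(buf, s, n);")]
    test = [
        ("strcpy(buf,  s);", "strncpy(buf, s, n);"),  # duplicate of a training sample
        ("gets(b);", "fgets(b, n, stdin);"),
    ]
    clean_test = remove_train_overlap(train, test)
    predictions = ["fgets(b, n, stdin);"]
    references = [fix for _, fix in clean_test]
    print(len(clean_test), perfect_prediction_rate(predictions, references))
```

An exact-match score of this kind counts a semantically correct fix that differs textually from the reference as a failure, which is one reason such a metric can understate or misstate a model's real-world repair ability.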
Pages: 13
Related papers
50 in total
  • [1] Impact of Code Language Models on Automated Program Repair
    Jiang, Nan
    Liu, Kevin
    Lutellier, Thibaud
    Tan, Lin
    2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE, 2023, : 1430 - 1442
  • [2] Evaluating Large Language Models for Real-World Vulnerability Repair in C/C++ Code
    Zhang, Lan
    Zou, Qingtian
    Singhal, Anoop
    Sun, Xiaoyan
    Liu, Peng
    PROCEEDINGS OF THE 10TH ACM INTERNATIONAL WORKSHOP ON SECURITY AND PRIVACY ANALYTICS, IWSPA 2024, 2024, : 49 - 58
  • [3] DrPlanner: Diagnosis and Repair of Motion Planners for Automated Vehicles Using Large Language Models
    Lin, Yuanfei
    Li, Chenran
    Ding, Mingyu
    Tomizuka, Masayoshi
    Zhan, Wei
    Althoff, Matthias
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (10): : 8218 - 8225
  • [4] Automated Program Repair Using Generative Models for Code Infilling
    Koutcheme, Charles
    Sarsa, Sami
    Leinonen, Juho
    Hellas, Arto
    Denny, Paul
    ARTIFICIAL INTELLIGENCE IN EDUCATION, AIED 2023, 2023, 13916 : 798 - 803
  • [5] Evaluating Large Language Models for Automated CPT Code Prediction in Endovascular Neurosurgery
    Roy, Joanna M.
    Self, D. Mitchell
    Isch, Emily
    Musmar, Basel
    Lan, Matthews
    Keppetipola, Kavantissa
    Koduri, Sravanthi
    Pontarelli, Mary-Katharine
    Tjoumakaris, Stavropoula I.
    Gooch, M. Reid
    Rosenwasser, Robert H.
    Jabbour, Pascal M.
    JOURNAL OF MEDICAL SYSTEMS, 2025, 49 (01)
  • [6] KARGEN: Knowledge-Enhanced Automated Radiology Report Generation Using Large Language Models
    Li, Yingshu
    Wang, Zhanyu
    Liu, Yunyi
    Wang, Lei
    Liu, Lingqiao
    Zhou, Luping
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT V, 2024, 15005 : 382 - 392
  • [7] Investigating large language models capabilities for automatic code repair in Python
    Omari, Safwan
    Basnet, Kshitiz
    Wardat, Mohammad
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (08): : 10717 - 10731
  • [8] FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair
    Fatima, Sakina
    Hemmati, Hadi
    Briand, Lionel C.
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2024, 50 (12) : 3146 - 3171
  • [9] Code Detection for Hardware Acceleration Using Large Language Models
    Martinez, Pablo Antonio
    Bernabe, Gregorio
    Garcia, Jose Manuel
    IEEE ACCESS, 2024, 12 : 35271 - 35281
  • [10] Evaluating Impact of Conventional Code Analysis Against Large Language Models in API Vulnerability Detection
    Yildirim, Recep
    Aydin, Kerem
    Cetin, Orcun
    PROCEEDINGS OF THE 2024 EUROPEAN INTERDISCIPLINARY CYBERSECURITY CONFERENCE, EICC 2024, 2024, : 57 - 64