Enhanced automated code vulnerability repair using large language models

Cited by: 2
Authors
de-Fitero-Dominguez, David [1 ]
Garcia-Lopez, Eva [1 ]
Garcia-Cabot, Antonio [1 ]
Martinez-Herraiz, Jose-Javier [1 ]
Affiliations
[1] Univ Alcala, Dept Ciencias Computac, Alcala De Henares 28805, Spain
Keywords
Automated code repair; Deep learning; Large language models; Vulnerability repair; Mistral; Code Llama;
DOI
10.1016/j.engappai.2024.109291
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
This research addresses the complex challenge of automated repair of code vulnerabilities, vital for enhancing digital security in an increasingly technology-driven world. The study introduces a novel and efficient format for representing code modifications, using advanced Large Language Models (LLMs) such as Code Llama and Mistral. These models, fine-tuned on datasets featuring C/C++ code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques. A key finding is the enhanced repair accuracy of these models compared to previous methods such as VulRepair, which underscores their practical utility and efficiency. The research also offers a critical assessment of current evaluation metrics, such as "Perfect Predictions", and their limitations in reflecting the true capabilities of automated repair models in real-world scenarios. Following this, it underscores the importance of using test datasets devoid of training samples, emphasizing the need for dataset integrity to enhance the effectiveness of LLMs in code repair tasks. The significance of this work is its contribution to digital security, setting new standards for automated code vulnerability repair and paving the way for future advancements in the fields of cybersecurity and artificial intelligence. The study not only highlights the potential of LLMs in enhancing code security but also fosters further exploration and research in these crucial areas.
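Two technical ideas in the abstract can be made concrete with a minimal sketch (this is not the paper's code; all function and variable names here are illustrative assumptions): the "Perfect Predictions" metric, under which a repair counts only if the generated patch matches the ground-truth fix exactly after whitespace normalization, and dataset integrity, i.e. dropping test samples whose inputs also occur in the training set so that scores are not inflated by memorization.

```python
def normalize(code: str) -> str:
    """Collapse whitespace so formatting differences do not count as mismatches."""
    return " ".join(code.split())

def perfect_prediction_rate(predictions, references) -> float:
    """Fraction of generated patches identical to the ground-truth fix."""
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references) if references else 0.0

def deduplicate_test_set(train_inputs, test_pairs):
    """Keep only test (vulnerable_code, fix) pairs whose input never appears in training."""
    seen = {normalize(x) for x in train_inputs}
    return [(x, y) for x, y in test_pairs if normalize(x) not in seen]
```

Under this metric a semantically correct but differently worded patch still scores zero, which is exactly the limitation of "Perfect Predictions" the abstract critiques.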
Pages: 13