Enhanced automated code vulnerability repair using large language models

Cited by: 2
Authors
de-Fitero-Dominguez, David [1]
Garcia-Lopez, Eva [1]
Garcia-Cabot, Antonio [1]
Martinez-Herraiz, Jose-Javier [1]
Affiliations
[1] Univ Alcala, Dept Ciencias Computac, Alcala De Henares 28805, Spain
Keywords
Automated code repair; Deep learning; Large language models; Vulnerability repair; Mistral; Code Llama
DOI
10.1016/j.engappai.2024.109291
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This research addresses the complex challenge of automatically repairing code vulnerabilities, which is vital for digital security in an increasingly technology-driven world. The study introduces a novel and efficient format for representing code modifications, using advanced Large Language Models (LLMs) such as Code Llama and Mistral. These models, fine-tuned on datasets of C/C++ code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques. A key finding is that they achieve higher repair accuracy than previous methods such as VulRepair, which underscores their practical utility and efficiency. The research also offers a critical assessment of current evaluation metrics, such as "Perfect Predictions", and their limitations in reflecting the true capabilities of automated repair models in real-world scenarios. It then underscores the importance of using test datasets free of training samples, emphasizing the need for dataset integrity to enhance the effectiveness of LLMs in code repair tasks. The significance of this work lies in its contribution to digital security, setting new standards for automated code vulnerability repair and paving the way for future advancements in cybersecurity and artificial intelligence. The study not only highlights the potential of LLMs in enhancing code security but also fosters further exploration and research in these crucial areas.
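The two evaluation points raised in the abstract can be illustrated with a minimal sketch. It assumes that "Perfect Predictions" means an exact match between the generated fix and the reference fix after whitespace normalization, and that train/test leakage is detected by comparing normalized vulnerable code; both are assumptions made for illustration, and the function names and the `vulnerable`/`fixed` dictionary keys are hypothetical rather than taken from the paper.

```python
def normalize(code: str) -> str:
    """Collapse whitespace runs so pure formatting differences are ignored."""
    return " ".join(code.split())


def perfect_prediction_rate(predictions, references):
    """Fraction of predictions that exactly match the reference fix.

    Assumption: exact match after whitespace normalization; the paper's
    precise matching rule is not stated in the abstract.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


def drop_train_overlap(train_samples, test_samples):
    """Keep only test samples whose vulnerable code never appears in training."""
    seen = {normalize(s["vulnerable"]) for s in train_samples}
    return [s for s in test_samples if normalize(s["vulnerable"]) not in seen]


# Hypothetical toy data: the first test sample duplicates a training sample
# up to whitespace, so it is filtered out before evaluation.
train = [{"vulnerable": "strcpy(buf, src);", "fixed": "strncpy(buf, src, n);"}]
test = [
    {"vulnerable": "strcpy(buf,  src);", "fixed": "strncpy(buf, src, n);"},
    {"vulnerable": "gets(line);", "fixed": "fgets(line, sizeof line, stdin);"},
]
clean_test = drop_train_overlap(train, test)
```

An exact-match score of this kind counts a semantically correct but differently written fix as a failure, which is one way such a metric can understate a model's real-world capability.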
Pages: 13