DLAP: A Deep Learning Augmented Large Language Model Prompting framework for software vulnerability detection

Cited by: 1
Authors
Yang, Yanjing [1 ]
Zhou, Xin [1 ]
Mao, Runfeng [1 ]
Xu, Jinwei [1 ]
Yang, Lanxin [1 ]
Zhang, Yu [1 ]
Shen, Haifeng [2 ]
Zhang, He [1 ]
Affiliations
[1] Nanjing Univ, Software Inst, State Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Southern Cross Univ, Fac Sci & Engn, Lismore, Australia
Funding
National Natural Science Foundation of China;
Keywords
Vulnerability detection; Large Language Model; Prompting engineering; Framework; NEURAL-NETWORKS;
DOI
10.1016/j.jss.2024.112234
Chinese Library Classification
TP31 [Computer software];
Discipline codes
081202; 0835;
Abstract
Software vulnerability detection is generally supported by automated static analysis tools, which have recently been reinforced by deep learning (DL) models. However, despite the superior performance of DL-based approaches over rule-based ones in research, applying DL approaches to software vulnerability detection in practice remains a challenge. This is due to the complex structure of source code, the black-box nature of DL, and the extensive domain knowledge required to understand and validate the black-box results for tasks after detection. Conventional DL models are trained on specific projects and hence excel at identifying vulnerabilities in those projects but not in others. Their poor cross-project detection performance in turn impairs downstream tasks such as vulnerability localization and repair. More importantly, these models do not provide explanations that help developers comprehend detection results. In contrast, Large Language Models (LLMs) with prompting techniques achieve stable performance across projects and provide explanations for their results. However, with existing prompting techniques, the detection performance of LLMs remains too low for real-world vulnerability detection. This paper contributes DLAP, a Deep Learning Augmented LLM Prompting framework that combines the best of both DL models and LLMs to achieve exceptional vulnerability detection performance. Experimental evaluation results confirm that DLAP outperforms state-of-the-art prompting frameworks, including role-based prompts, auxiliary-information prompts, chain-of-thought prompts, and in-context learning prompts, as well as fine-tuning, on multiple metrics.
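The core idea the abstract describes, folding a DL detector's output into the prompt given to an LLM, can be sketched as follows. This is a minimal illustration, not the paper's actual method: `dl_model_score` stands in for a trained DL detector (here a trivial keyword heuristic), and all names are hypothetical.

```python
# Hypothetical sketch of DL-augmented prompting: a DL model's preliminary
# verdict is embedded in the prompt so the LLM can confirm, refute, and
# explain it. Names and heuristics are illustrative, not from the paper.

def dl_model_score(code: str) -> float:
    """Stand-in for a trained DL vulnerability detector.

    A real detector would be a neural model; here we flag calls to
    well-known unsafe C functions as a placeholder.
    """
    risky = ("strcpy", "gets(", "sprintf")
    return 1.0 if any(tok in code for tok in risky) else 0.1


def build_augmented_prompt(code: str) -> str:
    """Fold the DL model's output into the LLM prompt, giving the LLM a
    project-specific detection hint plus the task of explaining it."""
    score = dl_model_score(code)
    hint = "likely vulnerable" if score > 0.5 else "likely safe"
    return (
        "You are a software security auditor.\n"
        f"A deep learning detector rates this code as {hint} "
        f"(score {score:.2f}).\n"
        "Review the code, confirm or refute the detector's verdict, "
        "and explain your reasoning:\n"
        f"```\n{code}\n```"
    )


# Example usage: the resulting prompt would be sent to an LLM API.
snippet = "void f(char *s) { char buf[8]; strcpy(buf, s); }"
prompt = build_augmented_prompt(snippet)
```

The design point is that the DL model contributes project-tuned detection signal while the LLM contributes cross-project stability and a human-readable explanation.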
Pages: 15