LineVul: A Transformer-based Line-Level Vulnerability Prediction

Cited by: 138
Authors
Fu, Michael [1]
Tantithamthavorn, Chakkrit [1]
Affiliations
[1] Monash Univ, Clayton, Vic, Australia
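Source
2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022) | 2022
Funding
Australian Research Council
DOI
10.1145/3524842.3528452
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract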
Software vulnerabilities are prevalent in software systems and cause a variety of problems, including deadlock, information loss, and system failure. Early prediction of software vulnerabilities is therefore critically important in safety-critical software systems. Various ML/DL-based approaches have been proposed to predict vulnerabilities at the file, function, or method level. Recently, IVDetect (a graph-based neural network) was proposed to predict vulnerabilities at the function level. Yet the IVDetect approach is still inaccurate and coarse-grained. In this paper, we propose LINEVUL, a Transformer-based line-level vulnerability prediction approach that addresses several limitations of the state-of-the-art IVDetect approach. Through an empirical evaluation on a large-scale real-world dataset with 188k+ C/C++ functions, we show that LINEVUL achieves (1) 160%-379% higher F1-measure for function-level predictions; (2) 12%-25% higher Top-10 Accuracy for line-level predictions; and (3) 29%-53% less Effort@20%Recall than the baseline approaches, highlighting the significant advancement of LINEVUL towards more accurate and more cost-effective line-level vulnerability predictions. Our additional analysis shows that LINEVUL is also very accurate (75%-100%) at predicting vulnerable functions affected by the Top-25 most dangerous CWEs, highlighting its potential impact in real-world usage scenarios.
Pages: 608-620
Number of pages: 13