LineVul: A Transformer-based Line-Level Vulnerability Prediction

Cited by: 168
Authors
Fu, Michael [1]
Tantithamthavorn, Chakkrit [1]
Affiliation
[1] Monash Univ, Clayton, Vic, Australia
Source
2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022) | 2022
Funding
Australian Research Council;
DOI
10.1145/3524842.3528452
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Software vulnerabilities are prevalent in software systems and cause a variety of problems, including deadlock, information loss, and system failure. Early prediction of software vulnerabilities is therefore critically important in safety-critical software systems. Various ML/DL-based approaches have been proposed to predict vulnerabilities at the file, function, or method level. IVDetect, a graph-based neural network, was recently proposed to predict vulnerabilities at the function level, yet it remains inaccurate and coarse-grained. In this paper, we propose LineVul, a Transformer-based line-level vulnerability prediction approach that addresses several limitations of the state-of-the-art IVDetect approach. Through an empirical evaluation on a large-scale real-world dataset of 188k+ C/C++ functions, we show that LineVul achieves (1) 160%-379% higher F1-measure for function-level predictions; (2) 12%-25% higher Top-10 Accuracy for line-level predictions; and (3) 29%-53% less Effort@20%Recall than the baseline approaches, a significant advance towards more accurate and more cost-effective line-level vulnerability prediction. Our additional analysis shows that LineVul is also very accurate (75%-100%) at predicting vulnerable functions affected by the Top-25 most dangerous CWEs, highlighting its potential impact in real-world usage scenarios.
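The abstract names three evaluation measures without defining them. The sketch below shows one plausible way to compute them, under assumptions that are not stated in this record: F1-measure is the harmonic mean of precision and recall over binary function labels; Top-10 Accuracy is the share of vulnerable functions with at least one truly vulnerable line among their 10 highest-scored lines; and Effort@20%Recall is the fraction of all lines that must be inspected, in descending score order, to find 20% of the truly vulnerable lines. The function names and the (score, is_vulnerable) data layout are hypothetical and are not taken from the LineVul code base.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical layout: each line is a (predicted_score, is_vulnerable) pair.
Line = Tuple[float, bool]


def f1_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for binary labels (1 = vulnerable)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def top_k_accuracy(functions: List[List[Line]], k: int = 10) -> float:
    """Share of functions whose k highest-scored lines include a vulnerable line.

    Assumes `functions` contains only functions that are actually vulnerable.
    """
    if not functions:
        raise ValueError("no functions given")
    hits = 0
    for lines in functions:
        ranked = sorted(lines, key=lambda line: line[0], reverse=True)
        if any(is_vuln for _, is_vuln in ranked[:k]):
            hits += 1
    return hits / len(functions)


def effort_at_recall(lines: List[Line], recall: float = 0.2) -> float:
    """Fraction of lines inspected (highest score first) to reach the target recall."""
    ranked = sorted(lines, key=lambda line: line[0], reverse=True)
    total_vuln = sum(1 for _, is_vuln in ranked if is_vuln)
    if total_vuln == 0:
        raise ValueError("no vulnerable lines in the input")
    # Round before ceil to guard against floating-point noise such as 3.0000000000000004.
    target = math.ceil(round(recall * total_vuln, 9))
    found = 0
    for inspected, (_, is_vuln) in enumerate(ranked, start=1):
        if is_vuln:
            found += 1
        if found >= target:
            return inspected / len(ranked)
    return 1.0
```

Under these assumed definitions, a lower Effort@20%Recall means fewer lines need manual inspection, which is the sense in which the abstract reports "less" effort as an improvement.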
Pages: 608-620
Page count: 13