Vulnerability Detection with Fine-Grained Interpretations

Cited by: 161
Authors
Li, Yi [1]
Wang, Shaohua [1]
Nguyen, Tien N. [2]
Affiliations
[1] New Jersey Institute of Technology, Newark, NJ 07102, USA
[2] University of Texas at Dallas, Richardson, TX 75083, USA
Source
PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21) | 2021
Funding
U.S. National Science Foundation;
Keywords
Vulnerability Detection; Deep Learning; Intelligence Assistant; Explainable AI (XAI); Interpretable AI;
DOI
10.1145/3468264.3468597
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Despite the successes of machine learning (ML) and deep learning (DL) based vulnerability detectors (VD), they are limited to providing only the decision on whether a given piece of code is vulnerable or not, without details on what part of the code is relevant to the detected vulnerability. We present IVDETECT, an interpretable vulnerability detector with the philosophy of using Artificial Intelligence (AI) to detect vulnerabilities, while using Intelligence Assistant (IA) to provide VD interpretations in terms of vulnerable statements. For vulnerability detection, we separately consider the vulnerable statements and their surrounding contexts via data and control dependencies. This allows our model to discriminate vulnerable statements better than using the mixture of vulnerable code and contextual code as in existing approaches. In addition to the coarse-grained vulnerability detection result, we leverage interpretable AI to provide users with fine-grained interpretations that include the sub-graph in the Program Dependency Graph (PDG) with the crucial statements that are relevant to the detected vulnerability. Our empirical evaluation on vulnerability databases shows that IVDETECT outperforms the existing DL-based approaches by 43%-84% and 105%-255% in top-10 nDCG and MAP ranking scores. IVDETECT correctly points out the vulnerable statements relevant to the vulnerability via its interpretation in 67% of the cases with a top-5 ranked list. IVDETECT improves over the baseline interpretation models by 12.3%-400% and 9%-400% in accuracy.
Pages: 292-303
Page count: 12