Vulnerability Detection with Fine-Grained Interpretations

被引:132
作者
Li, Yi [1 ]
Wang, Shaohua [1 ]
Nguyen, Tien N. [2 ]
机构
[1] New Jersey Inst Technol, Newark, NJ 07102 USA
[2] Univ Texas Dallas, Richardson, TX 75083 USA
来源
PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21) | 2021年
基金
美国国家科学基金会;
关键词
Vulnerability Detection; Deep Learning; Intelligence Assistant; Explainable AI (XAI); Interpretable AI;
D O I
10.1145/3468264.3468597
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Despite the successes of machine learning (ML) and deep learning (DL) based vulnerability detectors (VD), they are limited to providing only the decision on whether a given code is vulnerable or not, without details on what part of the code is relevant to the detected vulnerability. We present IVDETECT, an interpretable vulnerability detector with the philosophy of using Artificial Intelligence (AI) to detect vulnerabilities, while using Intelligence Assistant (IA) to provide VD interpretations in terms of vulnerable statements. For vulnerability detection, we separately consider the vulnerable statements and their surrounding contexts via data and control dependencies. This allows our model better discriminate vulnerable statements than using the mixture of vulnerable code and contextual code as in existing approaches. In addition to the coarse-grained vulnerability detection result, we leverage interpretable AI to provide users with fine-grained interpretations that include the sub-graph in the Program Dependency Graph (PDG) with the crucial statements that are relevant to the detected vulnerability. Our empirical evaluation on vulnerability databases shows that IVDETECT outperforms the existing DL-based approaches by 43%-84% and 105%-255% in top-10 nDCG and MAP ranking scores. IVDETECT correctly points out the vulnerable statements relevant to the vulnerability via its interpretation in 67% of the cases with a top-5 ranked list. IVDETECT improves over the baseline interpretation models by 12.3%-400% and 9%-400% in accuracy.
引用
收藏
页码:292 / 303
页数:12
相关论文
共 26 条
  • [1] Chakraborty Saikat, 2020, Deep learning based vulnerability detection: Are we there yet
  • [2] Chung J, 2014, NIPS 2014 WORKSH DEE
  • [3] A C/C plus plus Code Vulnerability Dataset with Code Changes and CVE Summaries
    Fan, Jiahao
    Li, Yi
    Wang, Shaohua
    Nguyen, Tien N.
    [J]. 2020 IEEE/ACM 17TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR, 2020, : 508 - 512
  • [4] Harer JA., 2018, ARXIV180304497
  • [5] Harer Jacob, 2018, Advances in Neural Information Processing Systems, P7933
  • [6] A Critical Evaluation of Spectrum-Based Fault Localization Techniques on a Large-Scale Software System
    Keller, Fabian
    Grunske, Lars
    Heiden, Simon
    Filieri, Antonio
    van Hoorn, Andre
    Lo, David
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY (QRS), 2017, : 114 - 125
  • [7] Kipf T.N., 2016, arXiv
  • [8] Improving Bug Detection via Context-Based Code Representation Learning and Attention-Based Neural Networks
    Li, Yi
    Wang, Shaohua
    Nguyen, Tien N.
    Son Van Nguyen
    [J]. PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2019, 3 (OOPSLA):
  • [9] VulDeePecker: A Deep Learning-Based System for Vulnerability Detection
    Li, Zhen
    Zou, Deqing
    Xu, Shouhuai
    Ou, Xinyu
    Jin, Hai
    Wang, Sujuan
    Deng, Zhijun
    Zhong, Yuyi
    [J]. 25TH ANNUAL NETWORK AND DISTRIBUTED SYSTEM SECURITY SYMPOSIUM (NDSS 2018), 2018,
  • [10] microsoft, NEUR NETW INT