Learning to Predict Severity of Software Vulnerability Using Only Vulnerability Description

Cited by: 109
Authors
Han, Zhuobing [1]
Li, Xiaohong [1]
Xing, Zhenchang [2]
Liu, Hongtao [1]
Feng, Zhiyong [3]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin Key Lab Adv Networking TANK, Tianjin, Peoples R China
[2] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT, Australia
[3] Tianjin Univ, Sch Comp Software, Tianjin, Peoples R China
Source
2017 IEEE International Conference on Software Maintenance and Evolution (ICSME) | 2017
Funding
US National Science Foundation;
Keywords
vulnerability severity prediction; multi-class classification; deep learning; mining software repositories;
DOI
10.1109/ICSME.2017.52
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Software vulnerabilities pose significant security risks to the host computing system. Faced with continuous disclosure of software vulnerabilities, system administrators must prioritize their efforts, triaging the most critical vulnerabilities to address first. Many vulnerability scoring systems have been proposed, but they all require expert knowledge to determine intricate vulnerability metrics. In this paper, we propose a deep learning approach to predict multi-class severity level of software vulnerability using only vulnerability description. Compared with intricate vulnerability metrics, vulnerability description is the "surface level" information about how a vulnerability works. To exploit vulnerability description for predicting vulnerability severity, discriminative features of vulnerability description have to be defined. This is a challenging task due to the diversity of software vulnerabilities and the richness of vulnerability descriptions. Instead of relying on manual feature engineering, our approach uses word embeddings and a one-layer shallow Convolutional Neural Network (CNN) to automatically capture discriminative word and sentence features of vulnerability descriptions for predicting vulnerability severity. We exploit large amounts of vulnerability data from the Common Vulnerabilities and Exposures (CVE) database to train and test our approach.
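The pipeline named in the abstract (word embeddings fed to a one-layer CNN that outputs a severity class) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. The label set, embedding size, filter window sizes, max-over-time pooling, and the helper names build_model, predict_proba, and predict are all assumptions made for illustration, and the weights are random rather than trained on CVE data.

```python
import math
import random

# Hypothetical severity labels; the abstract only says "multi-class".
LABELS = ["low", "medium", "high"]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, not from the paper)."""
    return text.lower().split()


def build_model(vocab, embed_dim=8, window_sizes=(1, 2, 3),
                filters_per_size=4, seed=0):
    """Create randomly initialised (untrained) embeddings, filters, and a dense layer."""
    rng = random.Random(seed)

    def rand(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]

    embeddings = {word: rand(embed_dim) for word in vocab}
    embeddings["<unk>"] = [0.0] * embed_dim  # fallback for unseen words
    # Each filter spans `h` consecutive word vectors: (h, weights, bias).
    filters = [(h, rand(h * embed_dim), 0.0)
               for h in window_sizes for _ in range(filters_per_size)]
    dense = [rand(len(filters)) for _ in LABELS]  # one weight row per class
    return {"embeddings": embeddings, "filters": filters,
            "dense": dense, "max_window": max(window_sizes)}


def predict_proba(model, text):
    """Forward pass: embed, convolve, ReLU, max-pool over time, softmax."""
    emb = model["embeddings"]
    dim = len(emb["<unk>"])
    vectors = [emb.get(tok, emb["<unk>"]) for tok in tokenize(text)]
    # Zero-pad short descriptions so the widest filter still fits once.
    while len(vectors) < model["max_window"]:
        vectors.append([0.0] * dim)

    features = []
    for h, weights, bias in model["filters"]:
        activations = []
        for start in range(len(vectors) - h + 1):
            window = [x for vec in vectors[start:start + h] for x in vec]
            z = sum(w * x for w, x in zip(weights, window)) + bias
            activations.append(max(0.0, z))  # ReLU
        features.append(max(activations))  # max-over-time pooling

    logits = [sum(w * f for w, f in zip(row, features))
              for row in model["dense"]]
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(model, text):
    """Return the label with the highest predicted probability."""
    probs = predict_proba(model, text)
    return LABELS[probs.index(max(probs))]
```

Calling build_model(["buffer", "overflow", "remote", "attacker"]) and then predict(model, "remote attacker triggers buffer overflow") returns one of the three assumed labels. Because the weights are untrained, the result only demonstrates the data flow, not a meaningful severity estimate.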
Pages: 125-136
Page count: 12