Predicting Cyber Vulnerability Exploits with Machine Learning

Cited by: 24
Authors
Edkrantz, Michel [1]
Said, Alan [1]
Affiliations
[1] Recorded Future AB, Gothenburg, Sweden
Source
THIRTEENTH SCANDINAVIAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (SCAI 2015) | 2015 / Vol. 278
Keywords
Machine Learning; SVM; cyber security; information security; exploits; vulnerability prediction; data mining
DOI
10.3233/978-1-61499-589-0-48
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For an information security manager, keeping up with new cyber vulnerabilities and deciding which to patch first can be a daunting task. Every day, numerous new vulnerabilities and exploits are reported for a wide variety of software configurations. We use machine learning to make automatic predictions for unseen vulnerabilities based on previous exploit patterns. As sources of historic vulnerability data, we use the National Vulnerability Database (NVD) and the Exploit Database (EDB). Our work shows that common words from the vulnerability descriptions, external references, and vendor products are the most important features to consider. Common Vulnerability Scoring System (CVSS) scores and categorical parameters, as well as Common Weakness Enumeration (CWE) numbers, are redundant when a large number of common words are used, since this information is often contained within the vulnerability description. Using machine learning algorithms, it is possible to achieve a prediction accuracy of 83% for binary classification. The performance differences between some of the algorithms are marginal with respect to metrics such as accuracy, precision, and recall. The best classifier with respect to both performance metrics and execution time is a linear-time Support Vector Machine (SVM) algorithm. We conclude that better predictions require higher data quality.
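The idea in the abstract, word features from vulnerability descriptions fed to a linear SVM for binary classification, can be illustrated with a minimal sketch. This is not the authors' implementation: the four descriptions and their labels are invented for illustration (they are not NVD or EDB records), the helper names are hypothetical, and the trainer is a simple hinge-loss subgradient method standing in for whichever linear SVM solver the paper used.

```python
from collections import Counter

# Invented toy data (assumption): +1 = exploit exists, -1 = no known exploit.
TOY_DESCRIPTIONS = [
    "remote attacker execute arbitrary code via buffer overflow",
    "sql injection allows remote attacker execute arbitrary sql commands",
    "local user cause denial of service via crafted file",
    "unspecified vulnerability has unknown impact",
]
TOY_LABELS = [1, 1, -1, -1]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(docs):
    """Map each distinct word to a feature index, in order of first appearance."""
    vocab = {}
    for doc in docs:
        for tok in tokenize(doc):
            vocab.setdefault(tok, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Sparse bag-of-words vector {feature index: count}; unknown words are dropped."""
    counts = Counter(tok for tok in tokenize(text) if tok in vocab)
    return {vocab[tok]: float(n) for tok, n in counts.items()}


def train_linear_svm(samples, labels, n_features, epochs=20, lr=0.1, lam=0.01):
    """Hinge-loss subgradient descent with L2 shrinkage on the weights."""
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(w[i] * v for i, v in x.items()) + b)
            # Regularization step: shrink every weight slightly.
            w = [wi * (1.0 - lr * lam) for wi in w]
            # Hinge-loss step: only samples inside the margin move the model.
            if margin < 1.0:
                for i, v in x.items():
                    w[i] += lr * y * v
                b += lr * y
    return w, b


def predict(text, vocab, w, b):
    """Return +1 (predicted exploited) or -1 (predicted not exploited)."""
    x = vectorize(text, vocab)
    score = sum(w[i] * v for i, v in x.items()) + b
    return 1 if score >= 0 else -1


vocab = build_vocabulary(TOY_DESCRIPTIONS)
samples = [vectorize(d, vocab) for d in TOY_DESCRIPTIONS]
w, b = train_linear_svm(samples, TOY_LABELS, len(vocab))
```

Calling `predict("remote attacker execute arbitrary code", vocab, w, b)` then classifies a new description with the learned weights. A real experiment would need thousands of labeled records and a held-out test set; four hand-written sentences say nothing about the 83% accuracy reported above.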
Pages: 48-57
Page count: 10