Can traditional fault prediction models be used for vulnerability prediction?

Cited by: 151
Authors
Shin, Yonghee [1]
Williams, Laurie [2]
Affiliations
[1] Depaul Univ, Chicago, IL 60604 USA
[2] N Carolina State Univ, Raleigh, NC 27695 USA
Funding
National Science Foundation (USA);
Keywords
Software metrics; Complexity metrics; Fault prediction; Vulnerability prediction; Open source project; Automated text classification; STATIC CODE ATTRIBUTES; SOFTWARE;
DOI
10.1007/s10664-011-9190-8
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Finding security vulnerabilities requires a different mindset than finding general faults in software: thinking like an attacker. Therefore, security engineers looking to prioritize security inspection and testing efforts may be better served by a prediction model that indicates security vulnerabilities rather than faults. At the same time, faults and vulnerabilities have commonalities that may allow development teams to use traditional fault prediction models and metrics for vulnerability prediction. The goal of our study is to determine whether fault prediction models can be used for vulnerability prediction or whether specialized vulnerability prediction models should be developed when both models are built with traditional metrics of complexity, code churn, and fault history. We have performed an empirical study on a widely used, large open source project, the Mozilla Firefox web browser, where 21% of the source code files have faults and only 3% of the files have vulnerabilities. Both the fault prediction model and the vulnerability prediction model provide similar ability in vulnerability prediction across a wide range of classification thresholds. For example, the fault prediction model provided recall of 83% and precision of 11% at classification threshold 0.6, and the vulnerability prediction model provided recall of 83% and precision of 12% at classification threshold 0.5. Our results suggest that fault prediction models based upon traditional metrics can substitute for specialized vulnerability prediction models. However, both fault prediction and vulnerability prediction models require significant improvement to reduce false positives while providing high recall.
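The recall and precision figures quoted in the abstract depend on the classification threshold applied to each file's predicted score. The following is a minimal sketch of that calculation, not the authors' implementation: the function name `precision_recall`, the rule that a file is flagged when its score is at or above the threshold, and the toy scores and labels are all assumptions made for illustration and are not data from the study.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when files scoring >= threshold are flagged.

    Assumption: a file is predicted vulnerable when its score is at or above
    the threshold; the paper may apply the threshold differently.
    """
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y)      # flagged, vulnerable
    fp = sum(1 for f, y in zip(flagged, labels) if f and not y)  # flagged, not vulnerable
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y)  # missed vulnerable files
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical per-file scores and ground-truth labels (1 = vulnerable).
scores = [0.9, 0.8, 0.7, 0.65, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]

for t in (0.5, 0.6):
    p, r = precision_recall(scores, labels, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

When vulnerable files are rare, as in the 3% figure reported above, even a modest false positive rate among the many non-vulnerable files produces far more false positives than true positives, which is consistent with high recall appearing alongside low precision.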
Pages: 25-59
Page count: 35