Automatically Identifying Security Bug Reports via Multitype Features Analysis

Cited by: 8
Authors
Zou, Deqing [1, 2]
Deng, Zhijun [1]
Li, Zhen [1, 3]
Jin, Hai [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab, Big Data Technol & Syst Lab,Cluster & Grid Comp L, Wuhan 430074, Peoples R China
[2] Shenzhen Huazhong Univ Sci & Technol, Res Inst, Shenzhen, Peoples R China
[3] Hebei Univ, Sch Cyber Secur & Comp, Baoding, Peoples R China
Source
INFORMATION SECURITY AND PRIVACY | 2018 / Vol. 10946
Funding
National Science Foundation (United States);
Keywords
Security bug identification; Bug report; Natural language processing; Machine learning;
DOI
10.1007/978-3-319-93638-3_35
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Bug-tracking systems are widely used by software developers to manage bug reports. Since fixing all bugs is time-consuming and costly, developers usually pay more attention to bugs with higher impact, such as security bugs (i.e., vulnerabilities), which malicious users can exploit to launch attacks and cause great damage. However, manually identifying security bug reports among the millions of reports in bug-tracking systems is difficult and error-prone. Furthermore, existing automated approaches to identifying security bug reports often incur many false negatives, leaving hidden dangers in the computer system. To address this problem, we present an automatic security bug report identification model based on multitype features analysis, dubbed Security Bug Report Identifier (SBRer). Specifically, we use multiple kinds of information contained in a bug report, including meta features and textual features, to automatically identify security bug reports via natural language processing and machine learning techniques. The experimental results show that SBRer with imbalanced data processing can successfully identify security bug reports with much higher precision (99.4%) and recall (79.9%) than existing work.
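The idea of combining meta features with textual features, and of reporting precision and recall, can be illustrated with a minimal Python sketch. This is not the authors' SBRer implementation: the field names `severity`, `component`, and `description`, the sample report, and the helper names `extract_features` and `precision_recall` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def extract_features(report):
    """Merge meta and textual features of one bug report into a single dict.

    The fields "severity", "component" and "description" are hypothetical;
    a real bug tracker exposes its own schema.
    """
    features = {}
    # Meta features: structured fields, encoded as one-hot indicators.
    for field in ("severity", "component"):
        value = report.get(field)
        if value is not None:
            features[f"meta:{field}={value}"] = 1
    # Textual features: counts of lowercase words in the free-text description.
    tokens = re.findall(r"[a-z]+", report.get("description", "").lower())
    for token, count in Counter(tokens).items():
        features[f"text:{token}"] = count
    return features


def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = security bug report)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up report used only to exercise the sketch.
example = {
    "severity": "critical",
    "component": "network",
    "description": "Buffer overflow in packet parser; overflow crashes server",
}
features = extract_features(example)
```

Every missed security bug report is a false negative and lowers recall, which is why the abstract treats false negatives as the main weakness of earlier approaches. The resulting feature dict could be fed to any classifier that accepts sparse features; which classifier and which imbalance handling the paper uses is not stated in this record.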
Pages: 619-633
Page count: 15