A tree-based machine learning methodology to automatically classify software vulnerabilities

Cited by: 4

Authors:

Aivatoglou, Georgios [1]

Anastasiadis, Mike [1]

Spanos, Georgios [1]

Voulgaridis, Antonis [1]

Votis, Konstantinos [1]

Tzovaras, Dimitrios [1]

Affiliations:

[1] Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece

Source:

Proceedings of the 2021 IEEE International Conference on Cyber Security and Resilience (IEEE CSR) | 2021

Funding:

European Union Horizon 2020

Keywords:

Software vulnerability categorization; Cyber-security; Machine learning; Decision trees; Random forests; Gradient boosting

DOI:

10.1109/CSR51186.2021.9527965

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Software vulnerabilities have become a major problem for security analysts, since the number of new vulnerabilities is constantly growing. This created a need for a categorization system that groups vulnerabilities so they can be handled more efficiently. To that end, the MITRE Corporation introduced the Common Weakness Enumeration (CWE), a list of the most common software and hardware vulnerabilities. However, manually understanding and analyzing new vulnerabilities is a slow and exhausting process for security experts. For this reason, this paper introduces a new automated classification methodology based on the textual vulnerability descriptions in the National Vulnerability Database (NVD). The proposed methodology combines textual analysis and tree-based machine learning techniques to classify vulnerabilities automatically. In the experiments, the proposed methodology achieved an overall accuracy close to 80%.
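To make the idea in the abstract concrete, the following is a minimal sketch of text-based, tree-based classification. It is not the authors' pipeline: the paper's features, models, and data are not given in this record, so everything here is an assumption for illustration. The descriptions and CWE labels are made up (not taken from NVD), the features are simple word-presence tests, and the model is a single hand-written decision tree using Gini impurity rather than a random forest or gradient boosting ensemble.

```python
from collections import Counter

# Toy, made-up descriptions and labels (assumption: NOT real NVD entries).
TRAIN = [
    ("sql injection in login form allows remote attackers to execute sql commands", "CWE-89"),
    ("sql injection via id parameter in search page", "CWE-89"),
    ("cross-site scripting in comment field allows injection of web script", "CWE-79"),
    ("stored cross-site scripting via profile name", "CWE-79"),
    ("buffer overflow in image parser allows memory corruption", "CWE-119"),
    ("heap buffer overflow when parsing crafted file", "CWE-119"),
]


def tokenize(text):
    """Textual analysis step: lowercase and split into a set of words."""
    return set(text.lower().split())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows):
    """Recursively grow a decision tree over word-presence features.

    rows is a list of (token_set, label). A leaf is a label string;
    an internal node is a tuple (word, subtree_if_present, subtree_if_absent).
    """
    labels = [y for _, y in rows]
    if len(set(labels)) == 1:
        return labels[0]
    best = None
    vocab = sorted(set().union(*(x for x, _ in rows)))
    for word in vocab:
        left = [r for r in rows if word in r[0]]
        right = [r for r in rows if word not in r[0]]
        if not left or not right:
            continue  # this word does not split the data
        score = (len(left) * gini([y for _, y in left])
                 + len(right) * gini([y for _, y in right])) / len(rows)
        if best is None or score < best[0]:
            best = (score, word, left, right)
    if best is None:
        # No word separates the rows: fall back to the majority label.
        return Counter(labels).most_common(1)[0][0]
    _, word, left, right = best
    return (word, build_tree(left), build_tree(right))


def predict(tree, text):
    """Walk the tree from the root using the words present in text."""
    tokens = tokenize(text)
    while isinstance(tree, tuple):
        word, present, absent = tree
        tree = present if word in tokens else absent
    return tree


tree = build_tree([(tokenize(t), y) for t, y in TRAIN])
print(predict(tree, "sql injection in admin panel"))       # CWE-89
print(predict(tree, "stack buffer overflow in parser"))    # CWE-119
```

On data this small the tree simply memorizes a few keywords, and any description without a known keyword falls through to whichever leaf sits at the end of the "absent" branches. An ensemble of many such trees trained on a large labeled corpus, as the keywords of this record suggest, would be the realistic counterpart.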


Pages: 312-317

Page count: 6
