Evolutionary Cost-Sensitive Ensemble for Malware Detection

Cited by: 4

Authors:

Krawczyk, Bartosz [1]

Wozniak, Michal [1]

Affiliations:

[1] Wroclaw Univ Technol, Dept Syst & Comp Networks, PL-50370 Wroclaw, Poland

Source:

INTERNATIONAL JOINT CONFERENCE SOCO'14-CISIS'14-ICEUTE'14 | 2014 / Vol. 299

Keywords:

machine learning; classifier ensemble; multiple classifier system; imbalanced classification; cost-sensitive; malware detection; IMBALANCED DATA; MINORITY CLASS; CLASSIFICATION;

DOI:

10.1007/978-3-319-07995-0_43

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Malware detection is among the most extensively developed areas in computer security. Unauthorized, malicious software can cause expensive damage to both private users and companies. It can destroy the computer, breach the user's privacy, and result in the loss of valuable data. The amount of data uploaded and downloaded each day makes manual screening of each incoming software package almost impossible. That is why there is a need for effective intelligent filters that can automatically distinguish between safe and dangerous applications. The number of malware programs faced by the detection system is typically much smaller than the number of legitimate programs. Therefore, we have to deal with an imbalanced classification problem, in which standard classification algorithms tend to fail. In this paper, we present a novel ensemble based on cost-sensitive decision trees. Individual classifiers are constructed according to an established cost matrix and trained on random feature subspaces to ensure that they are mutually complementary. Instead of using a fixed cost matrix, we derive its parameters via ROC analysis. An evolutionary algorithm is applied for simultaneous classifier selection and assignment of committee member weights for the fusion process. Experimental analysis, carried out on a large malware dataset, shows that our method is capable of outperforming other state-of-the-art algorithms and hence is an effective approach to the problem of imbalanced malware detection.
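
The selection-and-weighting step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the committee members' 0/1 votes are already computed (no decision trees are trained here), uses a fixed, hypothetical misclassification cost (5 for a missed malware sample, 1 for a false alarm) rather than costs derived via ROC analysis, and uses a simple mutation-only evolutionary loop. All function names and parameters are illustrative.

```python
import random

def fuse(votes, weights):
    """Weighted vote. votes[i][j] is member i's 0/1 label for sample j.
    A weight of 0 means the member is effectively not selected."""
    total = sum(weights)
    fused = []
    for j in range(len(votes[0])):
        score = sum(w * v[j] for w, v in zip(weights, votes))
        # Predict malware (1) when more than half of the weight votes for it.
        fused.append(1 if total > 0 and 2 * score > total else 0)
    return fused

def expected_cost(pred, truth, cost_fn=5.0, cost_fp=1.0):
    """Total misclassification cost; cost_fn and cost_fp are assumed values."""
    cost = 0.0
    for p, t in zip(pred, truth):
        if t == 1 and p == 0:
            cost += cost_fn   # missed malware
        elif t == 0 and p == 1:
            cost += cost_fp   # false alarm
    return cost

def evolve(votes, truth, generations=100, pop_size=20, seed=0):
    """Mutation-only evolutionary search over member weights (assumed scheme)."""
    rng = random.Random(seed)
    m = len(votes)

    def fitness(w):
        return expected_cost(fuse(votes, w), truth)

    # Start from uniform weights plus random individuals.
    pop = [[1.0] * m] + [[rng.random() for _ in range(m)]
                         for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[:pop_size // 2]          # elitist truncation selection
        children = []
        for p in parents:
            # Zeroing a weight drops a member; otherwise perturb and clip to [0, 1].
            children.append([
                0.0 if rng.random() < 0.1
                else min(1.0, max(0.0, w + rng.gauss(0, 0.2)))
                for w in p
            ])
        pop = parents + children
    return min(pop, key=fitness)
```

Because the best individuals are always carried over to the next generation, the returned weight vector never costs more on the given samples than the uniform-weight committee it starts from.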


Pages: 433-442

Page count: 10

22 items in total

[1] An exemplar-based learning approach for detection and classification of malicious network streams in honeynets
Abbasi, Fahim H.
Harris, Richard
Marsland, Stephen
Moretti, Giovanni
[J]. SECURITY AND COMMUNICATION NETWORKS, 2014, 7 (02) : 352 - 364
[2] Combined 5 x 2 cv F test for comparing supervised classification learning algorithms
Alpaydin, E
[J]. NEURAL COMPUTATION, 1999, 11 (08) : 1885 - 1892
[3] [Anonymous], 1984, OLSHEN STONE CLASSIF, DOI 10.2307/2530946
[4] Blaszczynski J, 2010, LECT NOTES ARTIF INT, V6086, P148, DOI 10.1007/978-3-642-13529-3_17
[5] SMOTEBoost: Improving prediction of the minority class in boosting
Chawla, NV
Lazarevic, A
Hall, LO
Bowyer, KW
[J]. KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2003, PROCEEDINGS, 2003, 2838 : 107 - 119
[6] An introduction to ROC analysis
Fawcett, Tom
[J]. PATTERN RECOGNITION LETTERS, 2006, 27 (08) : 861 - 874
[7] Ho TK, 1998, IEEE T PATTERN ANAL, V20, P832, DOI 10.1109/34.709601
[8] Krawczyk B., 2012, 2012 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), P507, DOI 10.1109/BHI.2012.6211629
[9] Cost-sensitive decision tree ensembles for effective imbalanced classification
Krawczyk, Bartosz
Wozniak, Michal
Schaefer, Gerald
[J]. APPLIED SOFT COMPUTING, 2014, 14 : 554 - 562
[10] Ling C.X., 2004, ICML, P544
