Malware Classification Using Deep Learning Methods

Cited by: 39
Authors
Cakir, Bugra [1]
Dogdu, Erdogan [2]
Affiliations
[1] BearTell Inc, Ankara, Turkey
[2] Cankaya Univ, Ankara, Turkey
Source
ACMSE '18: Proceedings of the ACMSE 2018 Conference | 2018
Keywords
Machine learning; deep learning; supervised learning; classification; malware detection;
DOI
10.1145/3190645.3190692
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Malware, short for malicious software, is growing continuously in number and sophistication as our digital world continues to grow. It is a very serious problem, and many efforts are devoted to malware detection in today's cybersecurity world. Many machine learning algorithms have been used for the automatic detection of malware in recent years. Most recently, deep learning has been used with better performance. Deep learning models are shown to work much better in the analysis of long sequences of system calls. In this paper, a shallow deep learning-based feature extraction method (word2vec) is used to represent any given malware sample based on its opcodes. A gradient boosting algorithm is used for the classification task. Then, k-fold cross-validation is used to validate the model performance without sacrificing a validation split. Evaluation results show up to 96% accuracy with limited sample data.
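The abstract's remark about k-fold cross-validation can be illustrated with a minimal sketch. It is not the authors' code: the function name `k_fold_indices` and the sample count are hypothetical, and the split is sequential, without the shuffling or stratification a real evaluation would usually add. It only shows why no separate validation split has to be held out: every sample is used for validation in exactly one fold and for training in all the others.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for each of k folds.

    Hypothetical helper for illustration; folds are contiguous and unshuffled.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        validation = list(range(start, stop))
        # Everything outside the validation window is used for training.
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


# Example: 10 samples, 3 folds (both numbers are assumptions for the demo).
folds = list(k_fold_indices(n_samples=10, k=3))
for train, validation in folds:
    print(len(train), len(validation))
```

In practice, a classifier would be fitted on each `train` list and scored on the matching `validation` list, and the k scores averaged into one estimate.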
Pages: 5
Related papers
18 in total
[1] Christodorescu M, 2003, USENIX ASSOCIATION PROCEEDINGS OF THE 12TH USENIX SECURITY SYMPOSIUM, P169
[2] Dahl GE, 2013, INT CONF ACOUST SPEE, P3422, DOI 10.1109/ICASSP.2013.6638293
[3] Polymorphic Malware Detection Using Sequence Classification Methods [J]. Drew, Jake; Moore, Tyler; Hahsler, Michael. 2016 IEEE SYMPOSIUM ON SECURITY AND PRIVACY WORKSHOPS (SPW 2016), 2016: 81-87
[4] Greedy function approximation: A gradient boosting machine [J]. Friedman, JH. ANNALS OF STATISTICS, 2001, 29(05): 1189-1232
[5] IDA, 2013, ID DIS DEB
[6] Mikolov T., 2013, P 26 INT C NEURAL IN, P3111
[7] Pascanu R, 2015, INT CONF ACOUST SPEE, P1916, DOI 10.1109/ICASSP.2015.7178304
[8] Popov I, 2017, 2017 SIBERIAN SYMPOSIUM ON DATA SCIENCE AND ENGINEERING (SSDSE), P1, DOI 10.1109/SSDSE.2017.8071952
[9] Opcode sequences as representation of executables for data-mining-based unknown malware detection [J]. Santos, Igor; Brezo, Felix; Ugarte-Pedrero, Xabier; Bringas, Pablo G. INFORMATION SCIENCES, 2013, 231: 64-82
[10] Saxe J, 2015, 2015 10TH INTERNATIONAL CONFERENCE ON MALICIOUS AND UNWANTED SOFTWARE (MALWARE), P11, DOI 10.1109/MALWARE.2015.7413680